Preface



This volume contains the proceedings of the 11th International Conference on 
Rewriting Techniques and Applications. The conference was held July 10-12, 
2000, at the University of East Anglia, Norwich, U.K. It is the major forum for 
the presentation of research on all theoretical and practical aspects of rewriting. 
Information about previous RTA conferences can be found at 

http://rewriting.loria.fr/rta/

and information about the general research area of rewriting at 
http://www.loria.fr/~vigneron/RewritingHP/

The program committee selected 18 papers, including three system descrip- 
tions, from a total of 44 submissions. In addition the program included invited 
talks by Jose Meseguer, Dale Miller, and Andrei Voronkov; and an invited tuto- 
rial by Sophie Tison. 

Many people contributed to RTA-2000 and I would like to express my sincere 
thanks to all of them. I am grateful to the program committee members and 
the external referees for reviewing the submissions and maintaining the high 
standard of the RTA conferences; to Richard Kennaway, who was responsible 
for the local arrangements for the conference; and to Jose Meseguer, the RTA 
publicity chair. It is a particular pleasure to thank Ashish Tiwari for his extensive 
assistance in many of my tasks as the program chair. Finally, I wish to thank 
the School of Information Systems at the University of East Anglia both for 
financial support and for providing the facilities. 



May 2000 



Leo Bachmair 
Rewriting Logic and Maude:
Concepts and Applications* 



Jose Meseguer 

Computer Science Laboratory 
SRI International, Menlo Park, CA 94025, USA 



Abstract. For the most part, rewriting techniques have been developed
and applied to support efficient equational reasoning and equational
specification, verification, and programming. Therefore, a rewrite rule
t → t' has usually been interpreted as a directed equation t = t'.
Rewriting logic is a substantial broadening of the semantics given to
rewrite rules. The equational reading is abandoned in favor of a more
dynamic interpretation. There are now in fact two complementary readings
of a rule t → t', one computational and another logical: (i)
computationally, the rewrite rule t → t' is interpreted as a local
transition in a concurrent system; (ii) logically, the rewrite rule
t → t' is interpreted as an inference rule. The experience gained so far
strongly suggests that rewriting is indeed a very flexible and general
formalism for both computational and logical applications. This means
that from the computational point of view rewriting logic is a very
expressive semantic framework, in which many different models of
concurrency, languages, and distributed systems can be specified and
programmed; and that from a logical point of view it is a general logical
framework in which many different logics can be represented and
implemented. This paper introduces the main concepts of rewriting logic
and of the Maude rewriting logic language, and discusses a wide range of
semantic framework and logical framework applications that have been
developed in rewriting logic using Maude.



1 Introduction 

For the most part, rewriting techniques have been developed and applied to 
support efficient equational reasoning and equational specification, verification, 
and programming. Therefore, a rewrite rule t → t' has usually been interpreted
as a directed equation t = t'. This equational semantics does of course suggest
long-term research directions to advance the applicability of rewriting techniques 
to equational reasoning. For example, notions of confluence and termination are 
central for equational purposes, and completion techniques for different variants
of equational logic, for rewriting modulo axioms, and in support of theorem
proving methods are important ongoing research topics.

* Supported by DARPA through Rome Laboratories Contract F30602-C-0312, by
DARPA and NASA through Contract NAS2-98073, by Office of Naval Research
Contract N00014-99-C-0198, and by National Science Foundation Grant CCR-9900334.

L. Bachmair (Ed.): RTA 2000, LNCS 1833, pp. 1-26, 2000.
(c) Springer-Verlag Berlin Heidelberg 2000

Rewriting logic [61] is a substantial broadening of the semantics given to 
rewrite rules. The equational reading is abandoned, in favor of a more dynamic, 
state-changing, and in general irreversible interpretation. There are now in fact 
two complementary readings of a rule t → t', one computational and another
logical:

- computationally, the rewrite rule t → t' is interpreted as a local transition
  in a concurrent system; that is, t and t' describe patterns for fragments of the
  distributed state of a system, and the rule explains how a local concurrent
  transition can take place in such a system, changing the local state fragment
  from the pattern t to the pattern t'.

- logically, the rewrite rule t → t' is interpreted as an inference rule, so that
  we can infer formulas of the form t' from formulas of the form t.



The computational and logical viewpoints are not exclusive: they complement 
each other. This can be illustrated with a simple Petri net example. Indeed, Petri 
nets [81] provide some of the simplest examples of systems exhibiting concurrent 
change, and therefore it is interesting to see how the computational and logical 
viewpoints appear in them as two sides of the same coin. 

Usually, a Petri net is presented as a set of places, a disjoint set of transitions 
and a relation of causality between them that associates to each transition the 
set of resources consumed as well as produced by its firing. Ugo Montanari and 
I recast this idea in an algebraic framework in [68]. From this point of view, 
resources are represented as multisets of places, and therefore we have a binary 
operation (multiset union, denoted ⊗ here) that is associative, commutative,
and has the empty multiset as an identity,¹ but is not idempotent. Then, a Petri
net is viewed as a graph whose arcs are the transitions and whose nodes are 
multisets over the set of places, usually called markings. 

The following Petri net represents a machine to buy cakes and apples; a cake 
costs a dollar and an apple three quarters. Due to an unfortunate design, the 
machine only accepts dollars, and it returns a quarter when the user buys an 
apple; to alleviate in part this problem, the machine can change four quarters 
into a dollar. 

As a graph, this net has the following arcs: 

buy-c : $ → c

buy-a : $ → a ⊗ q

change : q ⊗ q ⊗ q ⊗ q → $



¹ From now on the associativity, commutativity, and identity axioms are denoted by
the acronym ACI.








The expression of this Petri net in rewriting logic is now obvious. We can view
each of the labeled arcs of the Petri net as a rewrite rule in a rewrite theory having
a binary associative, commutative operator ⊗ (multiset union) with identity 1,
so that rewriting happens modulo ACI, that is, it is multiset rewriting. Then,
the concurrent computations of the Petri net from a given initial marking in
fact coincide with the ACI-rewriting computations that can be performed from
that marking using the above three rules. This is the obvious computational
interpretation.

But we can just as well adopt a logical interpretation in which the multiset
union operator ⊗ can be viewed as a form of resource-conscious non-idempotent
conjunction. Then, the state a ⊗ q ⊗ q corresponds to having an apple and a
quarter and a quarter, which is a strictly better situation than having an apple and
a quarter (non-idempotence of ⊗). Several researchers realized independently
that this ACI operation on multisets corresponds to the conjunctive connective
⊗ (tensor) in linear logic [3,45,56,57]. This complementary point of view sees a
net as a theory in this fragment of linear logic.

For example, in order to get the tensor theory corresponding to our Petri net 
above, it is enough to change the arrows in the graph presentation into turnstiles, 
getting the following axioms: 

buy-c : $ ⊢ c

buy-a : $ ⊢ a ⊗ q

change : q ⊗ q ⊗ q ⊗ q ⊢ $

Rewriting logic faithfully supports the computational and logical
interpretations, in the sense that a marking M' is reachable from a marking M by a
concurrent computation of the above Petri net iff there is an ACI-rewriting
computation M → M' iff the above tensor theory can derive the sequent M ⊢ M'
in linear logic (see [56,61]).
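
The computational reading of the vending-machine net can be tried out with a small Python sketch. This is only an illustration under stated assumptions, not part of rewriting logic or Maude: markings are modeled with `collections.Counter`, so rewriting modulo ACI is approximated by multiset operations, and the helper names `freeze`, `steps`, and `reachable` are hypothetical. The search fires one transition at a time; for reachability this is enough, since a concurrent firing can always be sequentialized.

```python
from collections import Counter, deque

# Rules of the vending-machine net: (label, consumed multiset, produced multiset).
RULES = [
    ("buy-c", Counter({"$": 1}), Counter({"c": 1})),
    ("buy-a", Counter({"$": 1}), Counter({"a": 1, "q": 1})),
    ("change", Counter({"q": 4}), Counter({"$": 1})),
]


def freeze(marking):
    """Canonical hashable form of a marking (a multiset of places)."""
    return tuple(sorted((p, n) for p, n in marking.items() if n > 0))


def steps(marking):
    """Yield (label, new_marking) for every transition enabled in marking."""
    for label, lhs, rhs in RULES:
        if all(marking[p] >= n for p, n in lhs.items()):
            yield label, marking - lhs + rhs


def reachable(start, goal):
    """Breadth-first search: is goal reachable from start by firing transitions?"""
    goal_key = freeze(goal)
    seen = {freeze(start)}
    queue = deque([start])
    while queue:
        marking = queue.popleft()
        if freeze(marking) == goal_key:
            return True
        for _, nxt in steps(marking):
            key = freeze(nxt)
            if key not in seen:
                seen.add(key)
                queue.append(nxt)
    return False


if __name__ == "__main__":
    # Four dollars buy four apples; the four returned quarters are changed
    # into a fifth dollar, which buys a fifth apple and leaves one quarter.
    print(reachable(Counter({"$": 4}), Counter({"a": 5, "q": 1})))
```

The search terminates for this particular net because every firing either loses money or turns four quarters into one dollar; for an arbitrary net the set of reachable markings may be infinite and a bound would be needed.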

The above example illustrates another important point, namely, that equational
logic, although surpassed by the rewriting logic interpretation of rewrite
rules, is nevertheless not abandoned. In fact, a rewrite theory ℛ is a 4-tuple
ℛ = (Σ, E, L, R), where (Σ, E) is the equational theory modulo which we
rewrite, L is a set of labels, and R is a set of labeled rules. In the above
example Σ consists of the binary operator ⊗ and the constants a, c, q, and $, and
E consists of the ACI axioms.
This means that, by identifying equational theories with rewrite theories such
that the sets L and R are empty, equational logic can be naturally regarded as a
sublogic of rewriting logic. In this way, the equational world is preserved intact
within the broader semantic interpretation proposed by rewriting logic, both at
the level of deduction and at the level of models. Note that rewriting techniques
are applicable at two different levels of a rewrite theory ℛ, namely, at the level
of its equations E (which in general may not only contain axioms like ACI, but
may for example contain Church-Rosser equations modulo such axioms) and at
the level of the rewrite rules R, which in general may not be Church-Rosser and
may not terminate.

The experience that we have gained so far strongly suggests that rewriting is
indeed a very flexible and general formalism for both computational and logical
applications. This means that from the computational point of view rewriting
logic is a very expressive semantic framework, in which many different models
of concurrency, languages, and distributed systems can be specified and
programmed; and that from a logical point of view it is a general logical framework
in which many different logics can be represented and implemented [58].

The goal of this paper is to introduce the main concepts of rewriting logic and 
of the Maude rewriting logic language that we have implemented at SRI [19,20], 
and to give a flavor for the many semantic framework and logical framework 
applications that rewriting logic makes possible, and that a language implemen- 
tation such as Maude can support by building tools and performing a variety 
of formal analyses on executable rewriting logic specifications. The paper does 
not intend to give a survey of research in rewriting logic, for which I refer the 
reader to the Workshop Proceedings [64,51] and to the survey [66]. In particular, 
I do not cover the research associated with two other important rewriting logic 
languages, namely, ELAN [52,10,9] and CafeOBJ [39]. 

The paper is organized as follows. Sections 2 and 3 introduce the main con- 
cepts of rewriting logic and Maude. Section 4 discusses some semantic framework 
applications, and Section 5 is dedicated to some logical framework applications.
I finish with some concluding remarks.

2 Rewriting Logic 

This section introduces the basic concepts of rewriting logic, including its infer- 
ence rules, initial and free models, reflection, and executability issues. 



2.1 Inference Rules and their Meaning 

A signature in rewriting logic is an equational theory² (Σ, E), where Σ is an
equational signature and E is a set of Σ-equations. Rewriting will operate on
equivalence classes of terms modulo E.

² Rewriting logic is parameterized by the choice of its underlying equational logic,
that can be unsorted, many-sorted, order-sorted, membership equational logic, and 
so on. To ease the exposition I give an unsorted presentation. 
Given a signature (Σ, E), sentences of rewriting logic are sequents of the
form

[t]_E → [t']_E,

where t and t' are Σ-terms possibly involving some variables, and [t]_E denotes
the equivalence class of the term t modulo the equations E. A rewrite theory
ℛ is a 4-tuple ℛ = (Σ, E, L, R) where Σ is a ranked alphabet of function
symbols, E is a set of Σ-equations, L is a set of labels, and R is a set of pairs
R ⊆ L × T_{Σ,E}(X)² whose first component is a label and whose second component
is a pair of E-equivalence classes of terms, with X = {x_1, ..., x_n, ...} a countably
infinite set of variables. Elements of R are called rewrite rules.³ We understand
a rule (r, ([t], [t'])) as a labeled sequent and use for it the notation r : [t] → [t'].
To indicate that {x_1, ..., x_n} is the set of variables occurring in either t or
t', we write r : [t(x_1, ..., x_n)] → [t'(x_1, ..., x_n)], or in abbreviated notation
r : [t(x̄)] → [t'(x̄)].

Given a rewrite theory ℛ, we say that ℛ entails a sentence [t] → [t'], or
that [t] → [t'] is a (concurrent) ℛ-rewrite, and write ℛ ⊢ [t] → [t'] if and
only if [t] → [t'] can be obtained by finite application of the following rules
of deduction (where we assume that all the terms are well formed and t(w̄/x̄)
denotes the simultaneous substitution of w_i for x_i in t):



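1. Reflexivity. For each [t] ∈ T_{Σ,E}(X),

\[ \frac{}{[t] \to [t]} \]

2. Congruence. For each f ∈ Σ_n, n ∈ ℕ,

\[ \frac{[t_1] \to [t'_1] \quad \ldots \quad [t_n] \to [t'_n]}{[f(t_1, \ldots, t_n)] \to [f(t'_1, \ldots, t'_n)]} \]

3. Replacement. For each rule r : [t(x_1, ..., x_n)] → [t'(x_1, ..., x_n)] in R,

\[ \frac{[w_1] \to [w'_1] \quad \ldots \quad [w_n] \to [w'_n]}{[t(\overline{w}/\overline{x})] \to [t'(\overline{w'}/\overline{x})]} \]

4. Transitivity.

\[ \frac{[t_1] \to [t_2] \qquad [t_2] \to [t_3]}{[t_1] \to [t_3]} \]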

Rewriting logic is a logic for reasoning correctly about concurrent systems
having states, and evolving by means of transitions. The signature of a rewrite
theory describes a particular structure for the states of a system (e.g., multiset,
binary tree, etc.) so that its states can be distributed according to such a
structure. The rewrite rules in the theory describe which elementary local transitions
are possible in the distributed state by concurrent local transformations. The
rules of rewriting logic allow us to reason correctly about which general
concurrent transitions are possible in a system satisfying such a description. Thus,
computationally, each rewriting step is a parallel local transition in a concurrent
system.

³ To simplify the exposition the rules of the logic are given for the case of unconditional
rewrite rules. However, all the ideas presented here have been extended to conditional
rules in [61] with very general rules of the form

r : [t] → [t'] if [u_1] → [v_1] ∧ ... ∧ [u_k] → [v_k].

This increases considerably the expressive power of rewrite theories.

Alternatively, however, we can adopt a logical viewpoint instead, and regard 
the rules of rewriting logic as metarules for correct deduction in a logical system. 
Logically, each rewriting step is a logical entailment in a formal system. 
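
The interplay of the Replacement and Congruence rules can be illustrated with a minimal Python sketch. It makes several simplifying assumptions that are not in the text above: the equations E are empty (so matching is purely syntactic), terms are encoded as nested tuples, variables are strings starting with "?", and the names `match`, `substitute`, and `one_step` are hypothetical. It enumerates single sequential rewrite steps only; the logic also allows several disjoint redexes to be rewritten in one concurrent step.

```python
# Terms: a variable is a string starting with "?"; any other string is a
# constant; a compound term is a tuple (function_symbol, arg1, ..., argn).

def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def match(pattern, term, subst=None):
    """Syntactic matching: return a substitution making pattern equal to term, or None."""
    subst = dict(subst or {})
    if is_var(pattern):
        if pattern in subst and subst[pattern] != term:
            return None
        subst[pattern] = term
        return subst
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def substitute(t, subst):
    """Apply a substitution; every variable of t is assumed to be bound in subst."""
    if is_var(t):
        return subst[t]
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(substitute(a, subst) for a in t[1:])


def one_step(term, rules):
    """Yield (label, result) for every single rule application at any position."""
    for label, lhs, rhs in rules:              # Replacement at the top
        s = match(lhs, term)
        if s is not None:
            yield label, substitute(rhs, s)
    if isinstance(term, tuple):                # Congruence: rewrite one argument
        for i in range(1, len(term)):
            for label, new_arg in one_step(term[i], rules):
                yield label, term[:i] + (new_arg,) + term[i + 1:]


if __name__ == "__main__":
    # A nondeterministic choice operator, specified by two rules.
    rules = [
        ("left", ("choice", "?x", "?y"), "?x"),
        ("right", ("choice", "?x", "?y"), "?y"),
    ]
    term = ("pair", ("choice", "a", "b"), ("choice", "c", "d"))
    for label, result in one_step(term, rules):
        print(label, result)
```

The two choices inside `pair` are independent of each other, which is the situation in which the logic permits both to be rewritten concurrently.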

2.2 Initial and Free Models

Equational logic enjoys very good model-theoretic properties. In particular, the 
existence of initial and free models, and of free extensions relative to theory 
interpretations are very useful and are at the basis of the initial algebra se- 
mantics of equational specifications and programming languages. All such good 
model-theoretic properties are preserved when equational logic is generalized to 
rewriting logic. For this, the notion of model itself must of course be generalized. 

I sketch the construction of initial and free models for a rewrite theory ℛ =
(Σ, E, L, R). Such models capture nicely the intuitive idea of a “rewrite system”
in the sense that they are systems whose states are E-equivalence classes of
terms, and whose transitions are concurrent rewritings using the rules in R. By 
adopting a logical instead of a computational perspective, we can alternatively 
view such models as “logical systems” in which formulas are validly rewritten 
to other formulas by concurrent rewritings which correspond to proofs for the 
logic in question. Such models have a natural category structure, with states 
(or formulas) as objects, transitions (or proofs) as morphisms, and sequential 
composition as morphism composition, and in them dynamic behavior exactly 
corresponds to deduction. 

Given a rewrite theory ℛ = (Σ, E, L, R), for which we assume that different
labels in L name different rules in R, the model that we are seeking is a
category T_ℛ(X) whose objects are equivalence classes of terms [t] ∈ T_{Σ,E}(X) and
whose morphisms are equivalence classes of “proof terms” representing proofs in
rewriting deduction, i.e., concurrent ℛ-rewrites. The rules for generating such
proof terms, with the specification of their respective domains and codomains,
are given below; they just “decorate” with proof terms the rules 1-4 of rewriting
logic. Note that we always use “diagrammatic” notation for morphism
composition, i.e., α; β always means the composition of α followed by β.

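1. Identities. For each [t] ∈ T_{Σ,E}(X),

\[ \frac{}{[t] : [t] \to [t]} \]

2. Σ-Structure. For each f ∈ Σ_n, n ∈ ℕ,

\[ \frac{\alpha_1 : [t_1] \to [t'_1] \quad \ldots \quad \alpha_n : [t_n] \to [t'_n]}{f(\alpha_1, \ldots, \alpha_n) : [f(t_1, \ldots, t_n)] \to [f(t'_1, \ldots, t'_n)]} \]

3. Replacement. For each rewrite rule r : [t(x_1, ..., x_n)] → [t'(x_1, ..., x_n)] in R,

\[ \frac{\alpha_1 : [w_1] \to [w'_1] \quad \ldots \quad \alpha_n : [w_n] \to [w'_n]}{r(\alpha_1, \ldots, \alpha_n) : [t(\overline{w}/\overline{x})] \to [t'(\overline{w'}/\overline{x})]} \]

4. Composition.

\[ \frac{\alpha : [t_1] \to [t_2] \qquad \beta : [t_2] \to [t_3]}{\alpha; \beta : [t_1] \to [t_3]} \]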

Each of the above rules of generation defines a different operation taking
certain proof terms as arguments and returning a resulting proof term. In other
words, proof terms form an algebraic structure P_ℛ(X) consisting of a graph with
nodes T_{Σ,E}(X), with identity arrows, and with operations f (for each f ∈ Σ),
r (for each rewrite rule), and _;_ (for composing arrows). Our desired model
T_ℛ(X) is the quotient of P_ℛ(X) modulo the following equations:⁴

1. Category

(a) Associativity. For all α, β, γ, (α; β); γ = α; (β; γ).

(b) Identities. For each α : [t] → [t'], α; [t'] = α and [t]; α = α.

2. Functoriality of the Σ-Algebraic Structure. For each f ∈ Σ_n, n ∈ ℕ,

(a) Preservation of composition. For all α_1, ..., α_n, β_1, ..., β_n,

f(α_1; β_1, ..., α_n; β_n) = f(α_1, ..., α_n); f(β_1, ..., β_n).

(b) Preservation of identities. f([t_1], ..., [t_n]) = [f(t_1, ..., t_n)].

3. Axioms in E. For t(x_1, ..., x_n) = t'(x_1, ..., x_n) an axiom in E, for all
α_1, ..., α_n, t(α_1, ..., α_n) = t'(α_1, ..., α_n).

4. Exchange. For each r : [t(x_1, ..., x_n)] → [t'(x_1, ..., x_n)] in R,

\[ \frac{\alpha_1 : [w_1] \to [w'_1] \quad \ldots \quad \alpha_n : [w_n] \to [w'_n]}{r(\overline{\alpha}) = r([\overline{w}]); t'(\overline{\alpha}) = t(\overline{\alpha}); r([\overline{w'}])} \]

Note that the set X of variables is actually a parameter of these constructions,
and we need not assume X to be fixed and countable. In particular, for X = ∅, we
adopt the notation T_ℛ. The equations in 1 make T_ℛ(X) a category, the equations
in 2 make each f ∈ Σ a functor, and 3 forces the axioms E. The exchange law
states that any rewriting of the form r(ᾱ) (which represents the simultaneous
rewriting of the term at the top using rule r and “below,” i.e., in the subterms
matched by the variables, using the rewrites ᾱ) is equivalent to the sequential
composition r([w̄]); t'(ᾱ), corresponding to first rewriting on top with r and then
below on the subterms matched by the variables with ᾱ, and is also equivalent
to the sequential composition t(ᾱ); r([w̄']), corresponding to first rewriting below
with ᾱ and then on top with r. Therefore, the exchange law states that rewriting
at the top by means of rule r and rewriting “below” using ᾱ are processes that
are independent of each other and can be done either simultaneously or in any
order.

Since each proof term is a description of a concurrent computation, what 
these equations provide is an equational theory of true concurrency, allowing
us to characterize when two such descriptions specify the same abstract compu- 
tation. From a logical viewpoint they provide a notion of abstract proof, where 
equivalent syntactic descriptions of the same proof object are identified. 

⁴ In the expressions appearing in the equations, when compositions of morphisms
are involved, we always implicitly assume that the corresponding domains and 
codomains match. 
The models T_ℛ and T_ℛ(X) are, respectively, the initial model and the free
model on the set of generators X for a general category of models of a rewrite
theory ℛ; each model interprets the rewrite rules in ℛ as natural transformations
in an underlying category endowed with algebraic structure [61].



2.3 Reflection 

Intuitively, a logic is reflective if it can represent its metalevel at the object 
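level in a sound and coherent way. More precisely, Manuel Clavel and I have
shown that rewriting logic is reflective [23,18,25] in the sense that there is
a finitely presented rewrite theory U that is universal in the sense that for any
finitely presented rewrite theory ℛ (including U itself) we have the following
equivalence

\[ \mathcal{R} \vdash t \to t' \iff U \vdash \langle \overline{\mathcal{R}}, \overline{t} \rangle \to \langle \overline{\mathcal{R}}, \overline{t'} \rangle, \]

where ℛ̄ and t̄ are terms representing ℛ and t as data elements of U. Since U
is representable in itself, we can achieve a “reflective tower” with an arbitrary
number of levels of reflection, since we have

\[ \mathcal{R} \vdash t \to t' \iff U \vdash \langle \overline{\mathcal{R}}, \overline{t} \rangle \to \langle \overline{\mathcal{R}}, \overline{t'} \rangle \iff U \vdash \langle \overline{U}, \overline{\langle \overline{\mathcal{R}}, \overline{t} \rangle} \rangle \to \langle \overline{U}, \overline{\langle \overline{\mathcal{R}}, \overline{t'} \rangle} \rangle \iff \ldots \]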

Reflection is a very powerful property. It is systematically exploited in the
Maude rewriting logic language implementation [19], which provides key features
of the universal theory U in a built-in module called META-LEVEL. In particular,
META-LEVEL has sorts Term and Module, so that the representations t̄ and ℛ̄ of
a term t and a module ℛ have sorts Term and Module, respectively. META-LEVEL
also has functions meta-reduce(ℛ̄, t̄) and meta-apply(ℛ̄, t̄, l, σ, n) which
return, respectively, the representation of the reduced form of a term t using the
equations in a module ℛ, and the (representation of the) result of applying a
rule labeled l in the module ℛ to a term t at the top with the (n + 1)th match
consistent with the partial substitution σ. Like the universal theory U that it
implements in a built-in fashion, META-LEVEL can also support a reflective tower
with an arbitrary number of levels of reflection.



2.4 Executability Issues 

How should rewrite theories be executed in practice? First of all, in a general
rewrite theory ℛ = (Σ, E, L, R) the equations E can be arbitrary, and therefore
E-equality may be undecidable. Faced with such a general problem, an equally
general solution is to transform ℛ into a rewrite theory ℛ♯ = (Σ, ∅, L ∪ L_E,
R ∪ E ∪ E⁻¹) in which we view the equations E as rules from left to right (E) and
from right to left (E⁻¹), labeled by appropriate new labels L_E. In this way, we
reduce the problem of rewriting modulo E to the problem of standard rewriting,
since we have the equivalence

ℛ ⊢ [t]_E → [t']_E  ⟺  ℛ♯ ⊢ t → t'.
In actual specification and programming practice we can do much better than
this, because the equational theory (Σ, E) is typically decidable. For
computational applications this assumption is reasonable, because in the initial model T_ℛ
the algebra T_{Σ,E} axiomatizes the state space of the concurrent system in
question, which should be computable. For applications in which ℛ is interpreted as
a logic, the decidability of (Σ, E) is again the norm, since then T_{Σ,E} is
interpreted as the set of formulas for the logic, which should be computable even if
deducibility of one formula from another may be undecidable.

An attractive and commonly occurring form for the decidable equational
theory (Σ, E) is with E = E' ∪ A, where A is a set of equational axioms for which
we have a matching algorithm, and E' is a set of Church-Rosser and terminating
equations modulo A. In these circumstances, a very attractive possibility is to
transform ℛ = (Σ, E' ∪ A, L, R) into the theory ℛ† = (Σ, A, L ∪ L_{E'}, R ∪ E').
That is, we now view the equations E' as rules added to R, labeled with
appropriate new labels L_{E'}. In this way, we reduce the problem of rewriting
modulo E to the much simpler problem of rewriting modulo A, for which, by
assumption, we have a matching algorithm.

The question is, of course, under which conditions is this transformation 
complete. That is, under which conditions do we have an equivalence 

ℛ ⊢ [t]_E → [t']_E  ⟺  ℛ† ⊢ [t]_A → [t']_A.

The above equivalence can be guaranteed if ℛ satisfies the following weak
coherence condition, which generalizes the coherence condition originally proposed
by Viry [90], and a similar condition in [62], Section 12.5.2.1, namely, whenever
ℛ† ⊢ [t]_A → [t']_A, we then also have ℛ† ⊢ can_{E',A}(t) → [t'']_A for some t'' such
that can_{E',A}(t') = can_{E',A}(t''), where can_{E',A}(t) denotes the A-equivalence class
of the canonical form of t when rewritten modulo A with the equations E'.

Weak coherence, and Viry’s coherence, express of course notions of relative 
confluence between the equations and the rules. Methods for checking or for 
achieving coherence, that generalize to the rewriting logic context similar tech- 
niques for equational completion modulo axioms have been proposed by Viry in 
several papers [90,89]. 

In actual executable specification and programming practice, one tends to
write specifications ℛ that are weakly coherent, or at least weakly ground
coherent. This happens because, given an equivalence class [t]_E, the associated
canonical form can_{E',A}(t) typically is built up by constructors modulo A, and
the patterns in the left-hand sides of the rules in R typically involve only such
constructors. The operational semantics of Maude assumes that the specification
ℛ is weakly (ground) coherent, and therefore always reduces terms to canonical
form with the equations modulo the given equational axioms A before any step
of rewriting with the rules R.
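
The "canonicalize with the equations, then rewrite with the rules" discipline can be seen in a toy Python sketch. Everything here is an illustrative assumption rather than Maude's implementation: A is empty, E' consists of the two usual equations for Peano addition, there is a single hypothetical rule tick : s(x) → x whose left-hand side uses only the constructor s, and the names `canon`, `tick_at_top`, and `rewrite_step` are made up for the example.

```python
ZERO = "0"


def s(n):
    """Constructor: successor."""
    return ("s", n)


def plus(m, n):
    """Defined function symbol, eliminated by the equations E'."""
    return ("plus", m, n)


def add(m, n):
    """Apply plus(0, y) = y and plus(s(x), y) = s(plus(x, y)) to canonical m, n."""
    if m == ZERO:
        return n
    return s(add(m[1], n))


def canon(t):
    """Canonical form of t with respect to E' (a term built from 0 and s only)."""
    if t == ZERO:
        return ZERO
    if t[0] == "s":
        return s(canon(t[1]))
    return add(canon(t[1]), canon(t[2]))


def tick_at_top(t):
    """Rule tick : s(x) -> x, tried at the top by syntactic matching only."""
    if t != ZERO and t[0] == "s":
        return t[1]
    return None


def rewrite_step(t):
    """Reduce to canonical form with the equations first, then apply the rule."""
    return tick_at_top(canon(t))


if __name__ == "__main__":
    t = plus(s(ZERO), s(ZERO))
    print(tick_at_top(t))   # the rule does not match the unreduced term
    print(rewrite_step(t))  # it does match the canonical form s(s(0))
```

The term plus(s(0), s(0)) is equal to s(s(0)) modulo E', so a rewrite with tick exists in the logic; the rule is only found operationally once the term has been put in canonical form.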

Even if we have a rewrite theory ℛ that is weakly (ground) coherent,
executing such a theory is a nontrivial matter for the following reasons:

1. ℛ need not be confluent;

2. ℛ need not be terminating;

3. some rules in ℛ may have extra variables in their right-hand sides and/or in
their conditions.

In fact, when we think of a rewrite theory as a means to specify a concurrent 
system, that can be highly nondeterministic, and that can be forever reacting to 
new events in its environment, assumptions such as confluence, which intuitively 
is a weak form of determinism, and termination cannot be expected to hold 
in general. Logical applications lead to similarly low expectations, since logical 
deduction can in general progress in many different directions. 

Therefore, even though some form of default execution⁵ may be quite useful
in practice for some classes of applications, in general rewrite theories should be 
executed with a strategy, and such strategies may be quite different depending 
on the given application. Therefore, rewriting logic languages such as Maude 
and ELAN [52,10,9] support rewriting with strategies. 

Reflection can be exploited to define internal rewriting strategies [23,24,18], 
that is, strategies to guide the rewriting process whose semantics can be defined 
inside the logic by rewrite rules at the metalevel. In fact, there is great free- 
dom for defining many different strategy languages inside Maude. This can be 
done in a completely user-definable way, so that users are not limited by a fixed 
and closed strategy language. The idea is to use the operations meta-reduce 
and meta-apply as basic strategy expressions, and then to extend the mod- 
ule META-LEVEL by additional strategy expressions and corresponding semantic 
rules. A number of strategy languages that have been defined following this 
methodology can be found in [18,20]. 
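
The idea of building strategy expressions from a few basic rule applications can be sketched in Python. This is an object-level analogy only, not Maude's META-LEVEL: the basic strategy `fire` plays the role that meta-apply plays in the text, states are markings of the vending-machine net from the Introduction modeled with `collections.Counter`, and the combinator names `seq`, `orelse`, and `repeat` are hypothetical.

```python
from collections import Counter

# Labeled rules of the vending-machine net: label -> (consumed, produced).
RULES = {
    "buy-c": (Counter({"$": 1}), Counter({"c": 1})),
    "buy-a": (Counter({"$": 1}), Counter({"a": 1, "q": 1})),
    "change": (Counter({"q": 4}), Counter({"$": 1})),
}

# A strategy maps a state to the list of states it can produce;
# the empty list means that the strategy fails on that state.


def fire(label):
    """Basic strategy: apply the rule with the given label once, if enabled."""
    lhs, rhs = RULES[label]

    def run(marking):
        enabled = all(marking[p] >= n for p, n in lhs.items())
        return [marking - lhs + rhs] if enabled else []

    return run


def seq(first, second):
    """Run first, then run second on each of its results."""
    return lambda m: [m2 for m1 in first(m) for m2 in second(m1)]


def orelse(first, second):
    """Use first if it succeeds, otherwise fall back to second."""
    return lambda m: first(m) or second(m)


def repeat(strategy):
    """Apply strategy for as long as it succeeds; return the final states."""
    def run(m):
        nexts = strategy(m)
        if not nexts:
            return [m]
        return [final for n in nexts for final in run(n)]

    return run


if __name__ == "__main__":
    # Buy apples whenever possible; change quarters only when no dollar is left.
    apples_only = repeat(orelse(fire("buy-a"), fire("change")))
    print(apples_only(Counter({"$": 4})))
```

Because the combinators are ordinary user-level definitions, new ones can be added freely, which mirrors the point made above that users are not limited to a fixed and closed strategy language. Note that `repeat` terminates here only because this particular strategy eventually fails on every marking.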

3 Maude 

Maude [19,20] is a language and system developed at SRI International whose 
modules are theories in rewriting logic. The most general Maude modules are 
called system modules. They have the syntax mod ℛ endm with ℛ the rewrite
theory in question, expressed with a syntax quite close to the corresponding
mathematical notation.⁶ The equations E in the equational theory (Σ, E)
underlying the rewrite theory ℛ = (Σ, E, L, R) are presented as a union E = E' ∪ A,
with A a set of equational axioms introduced as attributes of certain operators
in the signature Σ (for example, an operator + can be declared associative and
commutative by keywords assoc and comm) and where E' is a set of equations
that are assumed to be Church-Rosser and terminating modulo the axioms A.
Furthermore, as already pointed out in Section 2.4, ℛ is assumed to be weakly
(ground) coherent. Maude supports rewriting modulo different combinations of
such equational attributes A: operators can be declared with any combination of
associative, commutative, and with left, right, or two-sided identity attributes.

⁵ Maude supports a fair top-down default execution of this kind in which the user may
also specify a bound on the number of rewrites.

⁶ See [19] for a detailed description of Maude’s syntax, which is quite similar to that
of OBJ3 [43].

Since we can view an equational theory as a rewrite theory whose sets L
and R of labels and rules are both empty, Maude contains a sublanguage of
functional modules of the form fmod (Σ, E) endfm, where, as before, E = E' ∪
A, with E' Church-Rosser and terminating modulo A. The equational logic of
functional modules and of the equational part of system modules is membership 
equational logic [65], an expressive logic whose atomic formulas are equations 
t = t' and membership predicates t : s with s a sort. Membership equational 
logic is quite expressive. It supports sorts, subsorts, operator overloading, and 
equational partiality. But this expressiveness is achieved while being efficiently 
executable by rewriting and having suitable techniques for completion and for 
equational theorem proving [11]. 

Maude has a third class of modules, namely, object-oriented modules that 
specify concurrent object-oriented systems and have syntax omod ℛ endom, with
ℛ a sugared form for a rewrite theory. That is, object-oriented modules are
internally translated into ordinary system modules, but Maude provides a more
convenient syntax for them, supporting concepts such as objects, messages, ob- 
ject classes, and multiple class inheritance. 

Modules in Maude have an initial semantics. Therefore, a system module
mod ℛ endm specifies the initial model T_ℛ of the rewrite theory ℛ. Similarly, a
functional module fmod (Σ, E) endfm specifies the initial algebra T_{Σ,E} of the
equational theory (Σ, E).

In addition, as in OBJ3, Maude supports a module algebra with parameterized
modules, parameter theories (with “loose” semantics), views (that is,
theory interpretations), and module expressions. In Maude, this module algebra,
called Full Maude, is defined inside the language by reflection, and is easily ex- 
tensible with new module composition operations [37,35]. Such a module algebra 
is one important concrete application of the general metaprogramming capabil- 
ities made possible by Maude’s support of reflection through the META-LEVEL 
module. As already mentioned in Section 2.4, another important application of 
reflection is the capacity for defining internal strategy languages that can guide 
the execution of rewrite theories by means of rewrite rules at the metalevel. 

Even though the Maude system is an interpreter, its use of advanced
semi-compilation techniques that compile each rewrite rule into matching and
replacement automata [38] makes it a high-performance system that can reach up
to 1.66 million rewrites per second in the free theory case ($A = \emptyset$) and
from 130,000 to one million rewrites per second in the associative-commutative
case on a 500 MHz Alpha for some applications. In addition, a Maude compiler
currently under development can reach up to 13 million rewrites per second on
the same hardware for some applications.

This means that Maude can be used effectively not only for executable spec- 
ification purposes, but also for declarative programming. In addition, the design 
of a mobile language extension of Maude called Mobile Maude is currently under 
development [36].

Executables for the Maude interpreter, a language manual, a tutorial, examples
and case studies, and relevant papers can be found at
http://maude.csl.sri.com.



4 Semantic Framework Applications 

Semantic framework applications include both concurrency models and formal 
specification, analysis, and programming of communication systems. 



4.1 Concurrency Models 

From its beginning, one of the key applications of rewriting logic has been as 
a semantic framework that can unify a wide range of models of concurrency. 
The key idea is that rewriting logic is not yet another concurrency model; it 
is instead a logic in which different theories, specifying different concurrency 
models, can be defined. The initial models of such rewrite theories then provide 
“true concurrency” semantic models for each desired concurrency style. In fact, 
many different concurrency models have been formalized in this way within 
rewriting logic. Detailed surveys for many of these formalizations can be found 
in [61,63,66]. I list below some of the models that have been studied, giving 
appropriate references for the corresponding formalizations: 

— Concurrent Objects [62] 

— Actors [86,85,87] 

— Petri nets [61,82] 

— CSP (see Section 5.2) 

— CCS [67,59] 

— Parallel functional programming and the lambda calculus [61,53] 

— The π-Calculus [91]

— The UNITY language [61] 

— Dataflow [63] 

— Concurrent Graph Rewriting [63] 

— Neural Networks [63] 

— The Chemical Abstract Machine [61] 

— Real-time and hybrid systems [75,76] 

— Tile models [69,15] 

One important feature of these formalizations is that their initial models — 
which provide a general “true concurrency” semantics — or models closely related 
to such initial models, are often isomorphic to well-known true concurrency 
models. This has been shown for several variants of standard rewriting ($E = \emptyset$)
[28,53], for the lambda calculus [53], for Petri nets [61,29], for CCS [17], and for 
concurrent objects [70]. 
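
To illustrate how a concurrency model can be expressed as a rewrite theory, a place/transition Petri net can be read as rewriting of multisets of tokens, with one rewrite rule per transition. The Python sketch below only illustrates that reading. The net itself and the names `TRANSITIONS`, `enabled`, and `fire` are hypothetical and are not taken from the cited formalizations.

```python
from collections import Counter

# Hypothetical net: each transition rewrites a multiset of tokens (pre)
# into another multiset of tokens (post).
TRANSITIONS = {
    "buy_cake":  (Counter({"dollar": 1}), Counter({"cake": 1})),
    "buy_apple": (Counter({"dollar": 1}), Counter({"apple": 1, "quarter": 1})),
    "change":    (Counter({"quarter": 4}), Counter({"dollar": 1})),
}

def enabled(marking, name):
    """A transition is enabled when the marking contains its whole pre-multiset."""
    pre, _ = TRANSITIONS[name]
    return all(marking[place] >= n for place, n in pre.items())

def fire(marking, name):
    """One rewrite step: remove the pre-multiset, then add the post-multiset."""
    if not enabled(marking, name):
        raise ValueError(f"transition {name} is not enabled")
    pre, post = TRANSITIONS[name]
    result = Counter(marking)
    result.subtract(pre)
    result.update(post)
    return +result  # unary plus drops places that hold zero tokens

marking = Counter({"dollar": 2})
marking = fire(marking, "buy_apple")
marking = fire(marking, "buy_cake")
```

Because a marking is a multiset, the order in which tokens are listed is irrelevant, which corresponds to rewriting modulo associativity and commutativity of multiset union.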
4.2 Specifying, Analyzing, and Programming Communication
Systems 

I review here the experience gained so far in specifying in Maude communication 
systems such as communication protocols, cryptographic protocols, active net- 
works algorithms, composable communication services, and distributed software 
architectures. I also explain how such specifications can be subjected to a flexible 
range of formal analysis techniques. In addition, Maude can be used not only to 
specify communication systems, but also to program them in the Mobile Maude 
language currently under development. 

The general idea is to have a series of increasingly stronger formal methods, 
to which a system specification is subjected. Only after less costly and “lighter” 
methods have been used, leading to a better understanding and to important 
improvements and corrections of the original specification, is it meaningful and 
worthwhile to invest effort on “heavier” and costlier methods. Maude and its 
theorem proving tools [21] can be used to support the following, increasingly 
stronger methods: 

1. Formal specification. This process results in a first formal model of the sys- 
tem, in which many ambiguities and hidden assumptions present in an in- 
formal specification are clarified. A rewriting logic specification provides a 
formal model in exactly this sense. 

2. Execution of the specification. Executable rewriting logic specifications can 
be used directly for simulation and debugging purposes, leading to increas- 
ingly better designs. Maude’s default interpreter can be used for this purpose. 

3. Model-checking analysis. Errors in highly distributed and nondeterministic
systems not revealed by a particular execution can be found by a model- 
checking analysis that considers all behaviors of a system from an initial 
state, up to some level or condition. Maude’s metalevel strategies can be 
used to model check a specification this way. 

4. Narrowing analysis. By using symbolic expressions with logical variables, one 
can carry out a symbolic model-checking analysis in which all behaviors not 
from a single initial state, but from the possibly infinite set of states described 
by a symbolic expression are analyzed. A planned unification mechanism will 
allow Maude to perform this narrowing analysis in an efficient way. 

5. Formal Proof. For highly critical properties it is also possible to carry out 
a formal proof of correctness, which can be assisted by formal tools such as 
those in Maude’s formal environment [21]. 
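
The model-checking analysis of method 3, which considers all behaviors from an initial state up to some bound, can be sketched as a breadth-first search over the one-step rewrite relation. This is an illustrative Python sketch and not Maude's metalevel strategy mechanism. The function `search` and the toy system (two processes that each read a shared counter and then write back the value plus one) are assumptions introduced here.

```python
from collections import deque

def search(initial, successors, is_bad, max_depth):
    """Explore every state reachable within max_depth steps, breadth first.

    Returns a path (list of states) to the first bad state found, or None.
    """
    queue = deque([(initial, [initial])])
    seen = {initial}
    while queue:
        state, path = queue.popleft()
        if is_bad(state):
            return path
        if len(path) - 1 >= max_depth:
            continue  # depth bound reached: check the state but do not expand it
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None

# Hypothetical system. A state is (shared, procs); each process is (pc, tmp),
# where pc is 0 before its read, 1 before its write, and 2 when finished.
def successors(state):
    shared, procs = state
    result = []
    for i, (pc, tmp) in enumerate(procs):
        if pc == 0:      # read the shared counter into a local copy
            new_shared, new_proc = shared, (1, shared)
        elif pc == 1:    # write back the local copy plus one
            new_shared, new_proc = tmp + 1, (2, tmp)
        else:            # finished: no rule applies
            continue
        result.append((new_shared, procs[:i] + (new_proc,) + procs[i + 1:]))
    return result

def lost_update(state):
    shared, procs = state
    return all(pc == 2 for pc, _ in procs) and shared != 2

initial = (0, ((0, 0), (0, 0)))
trace = search(initial, successors, lost_update, max_depth=4)
```

A single execution in which one process finishes before the other starts never shows the lost update, whereas the exhaustive search finds the interleaving in which both processes read before either writes.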

We are still in an early phase in the task of applying rewriting logic to
communication systems. However, in addition to the work on foundations and on
models of concurrent computation, some recent research by different authors
focusing specifically on this area seems quite promising. The paper [32] surveys
some of these advances, which can be summarized as follows:

— The paper [34] reports on joint work by researchers at Stanford and SRI
with the group led by J.J. Garcia-Luna at the Computer Communications
Research Group at University of California Santa Cruz in which we used
Maude very early in the design of a new reliable broadcast protocol for Active 
Networks. In this work, we have developed precise executable specifications of 
the new protocol and, by analyzing it through execution and model-checking 
techniques, we have found many deadlocks and inconsistencies, and have 
clarified incomplete or unspecified assumptions about its behavior. 

— Maude has also been applied to the specification and analysis of
cryptographic protocols [30], showing how reflective model-checking techniques
can be used to discover attacks.

— The positive experience with security protocols has led to the adoption of 
Maude by J. Millen and G. Denker at SRI as the basis for giving a formal 
semantics to their new secure protocol specification language CAPSL and as 
the meta-tool used to endow CAPSL with an execution and formal analysis 
environment [33]. 

— The paper [92] reports joint work with Y. Wang and C. Gunter at the Uni- 
versity of Pennsylvania in using Maude to formally specify and analyze a 
PLAN [47] active network algorithm. 

— The paper [31] presents an executable specification of a general middle- 
ware architecture for composable distributed communication services such 
as fault-tolerance, security, and so on, that can be composed and can be dy- 
namically added to selected subsets of a distributed communications system. 

— In [19] (Appendix E) a substantial case study showing how Maude can 
be used to execute very high level software designs, namely architectural 
descriptions, is presented. It focuses on a difficult case, namely, heteroge- 
neous architectures illustrated by a command and control example featuring 
dataflow, message passing, and implicit invocation sub-architectures. Using 
Maude, each of the different subarchitectures can not only be executed, but 
they can also be interoperated in the execution of the resulting overall sys- 
tem. 

— As part of a project to represent the Wright architecture description language
[1] in Maude, Nodelman and Talcott have developed a representation of CSP
in Maude. This is compatible with existing tools for analyzing CSP
specifications, complements them by providing a rich execution environment and
the ability to analyze non-finite state specifications, and provides a means of
combining CSP specifications with other notations for specifying concurrent
systems.

— Najm and Stefani have used rewriting logic to specify computational models 
for open distributed systems [73]. 

— Talcott has used rewriting logic to define a very general model of open dis- 
tributed components [85]. 

— Pita and Martí-Oliet have used the reflective features of Maude to specify
the management process of broadband telecommunication networks [78,79]. 

— Nakajima has used rewriting logic to give semantics to the calculus of mobile
ambients and to specify a Java/ORB implementation of a network management
system [74].

— Wirsing and Knapp have defined a systematic translation from object-oriented
design notations for distributed object systems to Maude specifications
[93].



As already mentioned, Maude can be used not only for specifying communication
systems, but also for programming them. At SRI, F. Duran, S. Eker, P.
Lincoln and I are currently advancing the design of Mobile Maude [36]. This is
an extension of Maude supporting mobile computation that uses reflection in a
systematic way to obtain a simple and general declarative mobile language
design. The two key notions are processes and mobile objects. Processes are
located computational environments where mobile objects can reside. Mobile
objects can move between different processes in different locations, and can
communicate asynchronously with each other by means of messages. Each mobile
object contains its own code, that is, a rewrite theory $\mathcal{R}$,
metarepresented as a term $\overline{\mathcal{R}}$. In this way, reflection
endows mobile objects with powerful “higher-order” capabilities within a simple
first-order framework.

We expect that Mobile Maude will have good support for secure mobile com- 
putation for two reasons. Firstly, mobile objects will communicate with each 
other and will move from one location to another using state-of-the-art encryp- 
tion mechanisms. Secondly, because of the logical basis of Mobile Maude, we 
expect to be able to prove critical properties of applications developed in it with 
much less effort than would be required if the same applications were
developed in a conventional language such as Java. 

5 Logical Framework Applications 

When we look at rewriting logic from the logical point of view, it becomes a
logical framework in which many other logics can be naturally represented [58].
Furthermore, reflection gives it particularly powerful representational powers, so
that Maude can be used as a formal meta-tool to build many other formal tools
[22].



5.1 A Reflective Logical Framework 

The basic reason why rewriting logic can easily represent many other logics is 
that the syntax of a logic can typically be defined as an algebraic data type 
of formulas, satisfying perhaps some equations, such as associativity and
commutativity of logical operators like conjunction and disjunction, or equations
for explicit substitution to equationally axiomatize quantifiers and other bind- 
ing operators. That is, formulas can typically be expressed as elements of the 
initial algebra of a suitable equational theory. Then, the typical inference rules 
of a logic are nothing but rewrite rules that rewrite formulas, or proof-theoretic 
structures such as sequents, in the deduction process; if an inference rule has side 
conditions, then the corresponding rewrite rule is a conditional rule. Therefore, 
we can typically represent inference in a logic (or in a theory within a logic,
in the case when different signatures are possible) by means of a rewrite theory.
Furthermore, such a representation is usually very natural, because it mirrors 
very closely both the syntax and the inference rules of the original logic. 
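
The reading of inference rules as rewrite rules can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not a representation taken from the cited work: the formula constructors `Atom`, `Implies`, and `And`, and the functions `step` and `saturate`, are hypothetical names. Formulas form an algebraic data type, and the set of derived formulas is held in a `frozenset`, which mirrors treating conjunction as associative and commutative (and, here, also idempotent).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Implies:
    left: object
    right: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

def step(facts):
    """Apply every inference rule once, as a rewrite on the set of derived formulas."""
    new = set(facts)
    for f in facts:
        # Rule (and-elimination): from A /\ B derive A and B.
        if isinstance(f, And):
            new.update({f.left, f.right})
        # Rule (modus ponens): from A and A -> B derive B.
        if isinstance(f, Implies) and f.left in facts:
            new.add(f.right)
    return frozenset(new)

def saturate(facts):
    """Rewrite until no rule adds anything new (a fixpoint of step)."""
    facts = frozenset(facts)
    while True:
        nxt = step(facts)
        if nxt == facts:
            return facts
        facts = nxt

p, q, r = Atom("p"), Atom("q"), Atom("r")
derived = saturate({And(p, Implies(p, q)), Implies(q, r)})
```

Saturation terminates here because both rules only ever add subformulas of the initial formulas, of which there are finitely many.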

The representation of a logic $\mathcal{L}$ into rewriting logic can therefore be
understood as a mapping

$$\Psi : \mathcal{L} \longrightarrow \mathit{RWLogic}.$$

This suggests using the theory of general logics [60] to define the space of logics
as a category, in which the objects are the different logics, and the morphisms
are the different mappings translating one logic into another. In general, we can
axiomatize a translation $\Theta$ from a logic $\mathcal{L}$ to a logic
$\mathcal{L}'$ as a morphism

$$\Theta : \mathcal{L} \longrightarrow \mathcal{L}'$$

in the category of logics. A logical framework is then a logic $\mathcal{F}$ such
that a very wide class of logics can be mapped to it by maps of logics

$$\Phi : \mathcal{L} \longrightarrow \mathcal{F}$$

called representation maps, that have particularly good properties such as
conservativity (see the footnote on conservative maps below). By choosing
$\mathcal{F} = \mathit{RWLogic}$ we explore the use of rewriting logic as
a logical framework.

One reason why reflection makes rewriting logic particularly powerful as a
logical framework is that maps between logics can be reified and executed within
rewriting logic. We can do so by extending the universal theory $U$ with
equational abstract data type definitions for the data type of theories
$\mathit{Module}_{\mathcal{L}}$ for each logic $\mathcal{L}$ of interest. Then, a
map $\Theta : \mathcal{L} \longrightarrow \mathcal{L}'$ can be reified as an
equationally-defined function

$$\overline{\Theta} : \mathit{Module}_{\mathcal{L}} \longrightarrow \mathit{Module}_{\mathcal{L}'}.$$

Similarly, a representation map $\Psi : \mathcal{L} \longrightarrow \mathit{RWLogic}$
can be reified by a function

$$\overline{\Psi} : \mathit{Module}_{\mathcal{L}} \longrightarrow \mathit{Module}.$$

If the maps $\Theta$ and $\Psi$ are computable, then, by a metatheorem of Bergstra
and Tucker [7], it is possible to define the functions $\overline{\Theta}$ and
$\overline{\Psi}$ by means of corresponding finite sets of Church-Rosser and
terminating equations. That is, such functions can be effectively defined and
executed within rewriting logic.
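
To make the idea of a reified map more concrete, the following Python sketch treats theories as plain data and a map of theories as an ordinary function on that data. Everything in it is an assumption made for illustration. The types `HornTheory` and `RewriteTheory` merely stand in for a data type of theories of a source logic and for the data type of rewrite theories, and `represent` is a hypothetical translation of propositional Horn clauses. It is not one of the representation maps from the cited work, and no claim of conservativity is made for it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class HornTheory:
    # Hypothetical data type of source theories:
    # each clause is (head, body), with body a tuple of atom names.
    clauses: tuple

@dataclass(frozen=True)
class RewriteTheory:
    # Hypothetical data type of rewrite theories:
    # each rule is (label, lhs, rhs), with lhs and rhs tuples of atom names.
    rules: tuple

def represent(theory):
    """A map of theories written as a total function on data."""
    rules = []
    for i, (head, body) in enumerate(theory.clauses):
        # body => body + head: the premises are kept and the conclusion is added.
        rules.append((f"clause{i}", tuple(body), tuple(body) + (head,)))
    return RewriteTheory(tuple(rules))

source = HornTheory((("q", ("p",)), ("r", ("p", "q"))))
target = represent(source)
```

The point of the sketch is only that the translation is itself a computable function on a data type of theories, so it can be defined and run inside the same framework that executes the translated theories.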

In summary, mappings between logics, including maps representing other 
logics in rewriting logic, can be internalized and executed within rewriting logic, 
as indicated in the picture below. 

There is yet another reason why rewriting logic reflection is very important
for logical framework applications. By reflection, rewriting logic can not only be
used as a logical framework in which the deduction of a logic $\mathcal{L}$ can be
faithfully simulated, but also as a meta-logical framework in which we can reason
about the metalogical properties of a logic $\mathcal{L}$. David Basin, Manuel
Clavel and I have begun studying the use of the Maude inductive theorem prover
(see Section 5.2) enriched with reflective reasoning principles to prove such
metalogical properties [5,6].

Footnote: A map of logics is conservative [60] if the translation of a sentence is
a theorem if and only if the sentence was a theorem in the original logic.
Conservative maps are sometimes said to be adequate and faithful by other authors.



5.2 Formal Meta-tool Applications 

All this suggests using Maude as a formal meta-tool [22] to build other formal
tools such as theorem provers or translators between logics. Furthermore, such 
tools can be built from executable specifications of the inference systems and the 
mappings themselves. The reflective features of Maude, besides making this pos- 
sible, also allow building entire environments for such formal tools. This can be 
achieved by using the general meta-parsing and meta-pretty printing functions 
of the META-LEVEL module, and the LOOP-MODE module, which provides a general
read-eval-print loop that can be customized with appropriate rewrite rules to
define the interaction with the user for each tool. Our experience suggests that
it is much easier to build and maintain formal tools this way than with con- 
ventional implementation techniques. Also, because of the high performance of 
the Maude engine, the tools generated this way often have quite reasonable per- 
formance. Of course, special-purpose algorithms can be needed for performance 
reasons as components of a specific tool, but this is not excluded by our general 
methodology. 

The paper [22] gives a detailed account of different formal tools, including
theorem proving tools, execution and analysis environments for different
formalisms, a module algebra, and logic translators, that have been defined in
Maude by different authors using the above reflective methodology. I summarize
below our experience so far:







An Inductive Theorem Prover. Using the reflective features of Maude, we
have built an inductive theorem prover for equational logic specifications [21]
that can be used to prove inductive properties of both CafeOBJ specifications
[39] and of functional modules in Maude. As already mentioned, this tool can
be extended with reflective reasoning principles to reason about the
metalogical properties of a logic represented in rewriting logic or, more
generally, to prove metalevel properties [6].

A Church-Rosser Checker. We have also built a Church-Rosser checker tool
[21] that analyzes equational specifications to check whether they satisfy the
Church-Rosser property. This tool can be used to analyze order-sorted [42]
equational specifications in CafeOBJ and in Maude. The tool outputs a
collection of proof obligations, which can either be proved or be used to
modify the specification. Extensions of this tool to perform equational
completion and to check coherence of rewrite theories are currently under
development.

Full Maude. Maude has been extended with special syntax for object-oriented
specifications, and with a rich module algebra of parameterized modules and
module composition in the Clear/OBJ style [16,43], giving rise to the Full
Maude language. All of Full Maude has been formally specified in Maude
using reflection [37,35]. This formal specification (about 7,000 lines) is in
fact its implementation, which is part of the Maude distribution. Our
experience in this regard is very encouraging in several respects: firstly,
because of how quickly it was possible to develop Full Maude; secondly, because
of how easy it will be to maintain it, modify it, and extend it with new
features and new module operations; and thirdly, because of the competitive
performance with which it can carry out complex module composition and module
transformation operations, which makes the interaction with Full Maude quite
reasonable.

A Proof Assistant for OCC. Coquand and Huet’s calculus of constructions
CC [27] provides higher-order (dependent) types, but it is based on a fixed
notion of computation, namely β-reduction, which is quite restrictive in
practice. This situation has been addressed by the addition of inductive
definitions [77,54] and algebraic extensions in the style of abstract data type
systems [8]. Also, the idea of overcoming these limitations using some
combination of membership equational logic with the calculus of constructions
has been suggested as a long-term goal in [50]. Using the general results on the
mapping of pure type systems to rewriting logic (see the translation
PTS → RWLogic below), Mark-Oliver Stehr is currently investigating, and has
built a proof assistant for, the open calculus of constructions (OCC), an
equational variant of the calculus of constructions with an open computational
system and a flexible universe hierarchy.

Real-Time Maude. Based on a notion of real-time rewrite theory that can nat- 
urally represent many existing models of real-time and hybrid systems, and 
that has a straightforward translation into an ordinary rewrite theory [76], 
Peter Ölveczky and I are developing an execution and analysis environment
for specifications of real-time and hybrid systems called Real-Time Maude.
This tool translates real-time rewrite theories into Maude modules and can
execute and analyze such theories by means of a library of strategies that
can be easily extended by the user to perform other kinds of formal analysis. 

Maude Action Tool. Action semantics [71] is a formal framework for speci- 
fying the semantics of programming languages in a modular and readable 
way. Modular structural operational semantics (MSOS) is also a modular 
formalism for SOS definitions [72] that in particular can give an operational 
semantics to action notation preserving its modularity features. Christiano 
Braga, Hermann Haeusler, Peter Mosses, and I are currently developing a 
tool in Maude for the execution and analysis of programming language def- 
initions written either in action notation or in MSOS notation [12]. 

A CCS Execution and Verification Environment. Using Maude, Alberto
Verdejo and Narciso Martí-Oliet have built a flexible execution environment
for CCS based on the operational semantics of CCS, or on extensions of such a
semantics to traces of actions or to the weak transition relation. In this
environment, they can perform a variety of formal analyses using strategies, 
and they can model check a CCS process with respect to a modal formula 
in the Hennessy-Milner logic [88]. 

HOL → Nuprl. The HOL theorem proving system [44] has a rich library of
theories that can save a lot of effort, since many commonly encountered
theories need not be specified from scratch. Howe [49] defined a model-theoretic
map from the HOL logic into the logic of Nuprl [26], and implemented such
a map to make possible the translation from HOL theories to Nuprl theories.
However, the translation itself was carried out by conventional means, and 
therefore was not in a form suitable for metalogical analysis. Mark-Oliver 
Stehr and I have recently formally defined in Maude an executable formal 
specification of a proof-theoretic mapping that translates HOL theories into 
Nuprl theories. Large HOL libraries have already been translated into Nuprl 
this way. Furthermore, in collaboration with Pavel Naumov, an abstract
version of this mapping has been proved correct in the categorical framework
of general logics, and the mapping itself has been used to translate HOL proofs
into Nuprl proofs in a systematic way [84].

LinLogic → RWLogic. Narciso Martí-Oliet and I defined two simple mappings
from linear logic [41] to rewriting logic: one for its propositional fragment,
and another for first-order linear logic [58]. In addition, we explained
how, using the fact that rewriting logic is reflective, these mappings could
be specified and executed in Maude, thus endowing linear logic with an
executable environment. Based on these ideas, Manuel Clavel and Narciso
Martí-Oliet have specified in Maude the mapping from propositional linear
logic to rewriting logic [18].

Wright → CSP → RWLogic. Architectural description languages (ADLs) can
be useful in the early phases of software design, maintenance, and evolution.
Furthermore, if architectural descriptions can be subjected to formal analysis,
design flaws and inconsistencies can be detected quite early in the design
process. The Wright language [2] is an ADL with the attractive feature of
having a formal semantics based on CSP [48]. Uri Nodelman and Carolyn
Talcott have recently developed in Maude a prototype executable environment
for Wright using two mappings. The first mapping gives an executable
formal specification of the CSP semantics of Wright, that is, it associates to
each Wright architectural description a CSP process. The second mapping
gives an executable rewriting logic semantics to CSP itself. The composition
of both mappings provides a prototype executable environment for Wright,
which can be used, in conjunction with appropriate rewrite strategies, both to
animate Wright architectural descriptions and to submit such descriptions
to different forms of formal analysis.

PTS → RWLogic. Pure type systems (PTS) [4] generalize the λ-cube [4], which
already contains important higher-order systems, like the simply typed and
the (higher-order) polymorphic lambda calculi, a system λP close to the
logical framework LF [46], and their combination, the calculus of constructions
CC [27]. In [83] Mark-Oliver Stehr and I show how the definition of PTS
systems can be easily formalized in membership equational logic and define
uniform pure type systems (UPTS), a more concrete variant of PTS systems
that do not abstract from the treatment of names, but use a uniform notion of
names based on CINNI, a new first-order calculus of names and substitutions.
UPTS systems solve the problem of closure under α-conversion [80,55] in a
very elegant way. Furthermore, [83] describes how meta-operational aspects
of UPTS systems, like type checking and type inference, can be formalized
in rewriting logic.

Tile Logic → RWLogic. Tile logic is a flexible formalism for the specification
of synchronous concurrent systems [40]. Roberto Bruni, Ugo Montanari, and
I have defined a mapping from Tile Logic to Rewriting Logic [14,15] that
relates the semantic models of both formalisms. This mapping has then 
been used as a basis for executing tile logic specifications in Maude using 
appropriate strategies [13]. 

6 Conclusions 

I have introduced the main concepts of rewriting logic and Maude, and have 
given many examples of how they can be used in a wide range of semantic 
framework and logical framework applications. I hope to have given enough
evidence to suggest that rewriting techniques can be fruitfully extended and
applied beyond the equational logic world in the broader semantic context of
rewriting logic. There are indeed many theoretical and practical questions
waiting to be investigated.

Abstract. This tutorial is devoted to tree automata. We will present
some of the most fruitful applications of tree automata in rewriting the- 
ory and we will give an outline of the current state of research on tree 
automata. We give here just a sketch of the presentation. The reader can 
also refer to the on-line book “Tree Automata and Their Applications” 
[CDG+97]. 

1 Introduction 

Tree automata theory and term rewriting theory are strongly connected [Dau94,
Ott99]. On the one hand, tree automata can be viewed as a subclass of ground
term rewrite systems [Bra69a, Bra69b]. On the other hand, tree automata have
been used successfully as decision tools in rewriting theory. In this tutorial, we
will present some of the most fruitful applications of tree automata in
rewriting theory and point out some promising research directions in this area.
For definitions and properties of tree automata the reader can refer to [GS96,
CDG+97]. Most of the results we mention here and more references can be found
in [CDG+97].

2 Classical Tree Automata and Rewrite Systems 

If you want to use tree automata in rewriting theory, the ideal situation occurs
when the reducibility relation is recognizable: a binary relation is said to be
recognizable iff the set of its encodings is a recognizable tree language. A pair
of terms is encoded simply by overlapping its two terms: e.g. the pair
[f(a,b), f(f(a,a),a)] is encoded as [f,f]([a,f]([⊥,a],[⊥,a]),[b,a]), where ⊥ pads
the positions that exist in only one of the two terms. For instance,
reducibility relations are recognizable for ground rewrite systems (more
generally, for linear term rewriting systems such that the left and right members
of the rules do not share variables). Now, let us consider the following logical
theory: the set of formulas is the set of all first-order formulas using no
function symbols and a single binary predicate symbol, and the predicate symbol
is interpreted as the reducibility relation associated with a given rewrite
system. When the reducibility relation is recognizable, you easily get the
decidability of the theory, thanks to the good closure and decision properties
of tree automata; this implies decidability of any property expressible in this
theory, like confluence. Furthermore, under some conditions, you can enrich the
theory for expressing termination properties [DT90].
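
The overlapping encoding of a pair of terms can be sketched in Python as follows. The representation of a term as a `(symbol, children)` tuple, the function names `overlap` and `show`, and the choice of ⊥ as the padding symbol for positions present in only one term are assumptions made for this illustration.

```python
BOTTOM = "⊥"  # padding symbol for positions that exist in only one term

def overlap(s, t):
    """Encode a pair of terms as one term over pairs of symbols.

    A term is (symbol, children) with children a tuple of terms.
    None stands for a missing subterm and is rendered with BOTTOM.
    """
    s_sym, s_kids = s if s is not None else (BOTTOM, ())
    t_sym, t_kids = t if t is not None else (BOTTOM, ())
    width = max(len(s_kids), len(t_kids))
    kids = tuple(
        overlap(s_kids[i] if i < len(s_kids) else None,
                t_kids[i] if i < len(t_kids) else None)
        for i in range(width)
    )
    return ((s_sym, t_sym), kids)

def show(term):
    """Render an encoded term in the bracket notation used in the text."""
    (a, b), kids = term
    head = f"[{a},{b}]"
    if not kids:
        return head
    return head + "(" + ",".join(show(k) for k in kids) + ")"

a = ("a", ())
b = ("b", ())

def f(x, y):
    return ("f", (x, y))

encoded = overlap(f(a, b), f(f(a, a), a))
print(show(encoded))  # [f,f]([a,f]([⊥,a],[⊥,a]),[b,a])
```

The result is an ordinary term over a product alphabet, which is why a relation on terms can be handled by a tree automaton reading a single tree.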

Clearly, recognizability of the reducibility relation is a very strong property
restricted to limited subclasses of rewrite systems. Instead, you can just require
that the reducibility relation preserves recognizability, i.e. that the set of
descendants (resp. ancestors) of a recognizable tree language is recognizable.
(Let us note that this is not the case even for linear systems.) For example,
reachability can then be easily reduced to membership in a recognizable tree
language. Although preservation of regularity is undecidable [GT95], some
conditions ensure it, and this leads to decidability of reachability and
joinability for some subclasses of term rewriting systems [Sal88, CDGV94, Jac96,
FJSV98, NT99].
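
Once a set of descendants is known to be recognizable, a reachability question becomes a membership test against a tree automaton. The Python sketch below shows only that membership test, for a deterministic bottom-up automaton. The automaton used (Boolean terms that evaluate to true) is a hypothetical example and is not the descendant automaton of any particular rewrite system. The names `run`, `accepts`, and `delta` are likewise assumptions.

```python
def run(term, delta):
    """Bottom-up run of a deterministic tree automaton.

    A term is (symbol, children). Returns the state reached at the root,
    or None when no transition applies.
    """
    symbol, kids = term
    states = []
    for kid in kids:
        q = run(kid, delta)
        if q is None:
            return None
        states.append(q)
    return delta.get((symbol, tuple(states)))

def accepts(term, delta, final):
    """Membership test: the root state must be a final state."""
    return run(term, delta) in final

# Hypothetical automaton: state q1 means "evaluates to true", q0 "to false".
delta = {
    ("true", ()): "q1",
    ("false", ()): "q0",
    ("not", ("q0",)): "q1",
    ("not", ("q1",)): "q0",
}
for x in ("q0", "q1"):
    for y in ("q0", "q1"):
        delta[("and", (x, y))] = "q1" if x == y == "q1" else "q0"
        delta[("or", (x, y))] = "q0" if x == y == "q0" else "q1"
final = {"q1"}

true, false = ("true", ()), ("false", ())
term = ("or", (false, ("not", (false,))))
```

The run visits each node of the term once, so membership is decided in time linear in the size of the term for a deterministic automaton.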

You can also require recognizability of the language of ground normal forms:
it provides, for example, a very simple procedure for testing the ground
reducibility of a term. Clearly, the set of ground normal forms is a regular tree
language for left-linear rewrite systems. Moreover, recognizability of the set of
normal forms has been proven decidable [VG92, Kou92]. Finally, recognizability
of the set of normalizable terms (which is clearly ensured when the set of ground
normal forms is recognizable and the inverse reducibility relation preserves
recognizability) ensures decidability of the sequentiality of the system, when it
is left-linear [Com95].

All the previous approaches provide good decision procedures, but only for
very restricted classes of rewrite systems. If you are interested in one
particular rewrite system, you can also try to prove “experimentally” its good
behavior w.r.t. recognizability. E.g. J. Waldmann has proven by computer the
recognizability of the set of normalizing S-terms [Wal98]. Some pointers to
software for manipulating tree regular expressions and automata can be found in
[CDG+97]. But a question arises: How far can we go in using tree automata when
describing properties of term rewriting systems? Can we find new “interesting”
classes of t.r.s. with good properties w.r.t. recognizability?



3 How to Go beyond the Limits of Usual Tree Automata? 

A way of going beyond the limits of the previous approaches is to consider 
approximation of rewrite systems. For example, Comon and Jacquemard study 
by these means reduction strategies and sequentiality [Com95, Jac96]. More 
recently, T. Genet and F. Klay compute regular over-approximations of the set 
of the descendants [Gen98] and use them for the verification of cryptographic 
protocols [GK00].

But to go beyond the limits of tree automata, you can also use extensions
of tree automata. The idea is roughly to enrich the notion of tree automata for
dealing with non-linearity while keeping good closure and decision properties.
Several classes have been defined with this aim and have been successfully
applied in term rewriting theory. For example, reduction automata provide
decision procedures for emptiness and finiteness of the language of ground normal
forms for every term rewriting system, and they give a new procedure for testing
ground reducibility [Pla85, DCC95, CJ97, CJ94]. Tree automata with tests between
brothers, which have good decision properties, also yield some new results
for non-linear rewrite systems [CSTT99, STT99]. Let us finally also cite the
powerful notion of tree tuple synchronized grammars, which has been used in
unification theory and rewriting theory [LR98, LR99].

Of course, you can combine these last two approaches, e.g. by approximating
the set of descendants by extended recognizable tree languages. This opens
new prospects and will require the design of software for dealing with extended tree
automata.

Abstract. This paper presents a system for explicit substitutions in
Pure Type Systems (PTS). The system allows solving type checking,
type inhabitation, higher-order unification, and type inference for PTS
using purely first-order machinery. A novel feature of our system is that
it combines substitutions and variable declarations. As a side effect, this
allows type checking let-bindings. Our treatment of meta-variables is also
explicit, such that instantiation of meta-variables is internalized in the
calculus. This produces a confluent λ-calculus with distinguished holes
and explicit substitutions that is insensitive to α-conversion, and allows
directly embedding the system into rewriting logic.



1 Introduction 

Explicit substitutions provide a convenient framework for encoding higher-order
typed λ-calculus using first-order machinery. In particular, this makes it possible to
integrate higher-order unification with first-order provers and rewriting logic, and to
delay evaluation and resolve scoping when type checking dependently typed terms.
On the other hand, several problems related to type theory, such as type checking
(with definitions), type inference, checking equality of well-typed terms,
proof-term refinement, and the inhabitation problem, can be solved using the
same machinery, once it is properly developed. We therefore combine here
explicit substitutions, variable declarations as explicit substitutions, and explicit
instantiation of meta-variables using first-order rewrite rules. The combination
is formulated for pure type systems, and therefore applies to arbitrary type
systems such as those of the λ-cube. A higher-order unification procedure for systems in
the λ-cube is a particular payoff.

It is well-known that definitions, i.e., let-in expressions, are problematic in
dependent-type systems [24, 4]. Two approaches have been used to extend the
λ-calculus with definitions. Severi and Poll [24] consider definitions as terms and
extend the reduction relation to unfold definitions during the typing process.
Bloo et al. [4] do not extend the syntax of terms (although they use a different
notation for terms called item notation), but they consider definitions as part
of the contexts. Combining the two approaches, Bloo [3] proposes a calculus of
explicit substitutions where substitutions are also part of the contexts. In Bloo’s
system, α-conversion remains an implicit rule and non-explicit substitutions
are still required to unfold definitions on types. The unfolding of definitions on
types is inconvenient when they contain meta-variables. In this paper, we propose
a system where contexts are a special case of explicit substitutions, so that we can safely
extend our calculus with definitions and meta-variables, and yet be completely
insensitive to α-conversion.

Finally, our extended calculus of explicit substitutions, where instantiations
are also explicit, realizes the spirit of delayed evaluation to also cover instantiation
of meta-variables. Indeed, our calculus can be considered a λ-calculus of
contexts,¹ i.e., a λ-calculus with place-holders where the mechanism of filling in
holes is explicit.

The rest of this paper is organized as follows. We continue with a short review
of pure type systems and explicit substitutions. More detailed descriptions can be
found in [2] and [1]. Section 2 introduces the system PTS$_\mathcal{L}$, which is a system
of pure types with explicit substitutions. Section 3 presents two novel aspects
of PTS$_\mathcal{L}$ with respect to other proposals: definitions and explicit instantiations.
Section 4 summarizes a meta-theoretical investigation, and Section 5 applies the
system to higher-order unification and related typing problems.

Pure Type Systems: Pure Type Systems [14, 2] is a formalism to describe
a family of type systems which generalizes the cube of typed lambda calculi.
A Pure Type System, PTS for short, is a typed λ-calculus given by a triple
$(\mathcal{S}, \mathcal{A}, \mathcal{R})$, where $\mathcal{S}$ is a base set of sorts, $\mathcal{A} \subseteq \mathcal{S} \times \mathcal{S}$ is a set of axioms, and
$\mathcal{R} \subseteq \mathcal{S} \times \mathcal{S} \times \mathcal{S}$ is a set of rules. A sort $s_1$ is a top sort if there does not exist a sort
$s_2$ such that $(s_1, s_2) \in \mathcal{A}$.

The set of PTS (pseudo-)terms is formed by variables $x, y, \ldots$, applications
$(M\ N)$, abstractions $\lambda x{:}A.M$, products $\Pi x{:}A.B$, and sorts $s \in \mathcal{S}$. Abstractions
and products are binding structures. As usual in higher-order formal systems,
terms are considered equivalent modulo α-conversion, i.e., renaming of bound
variables. Notice that there is no syntactical distinction between terms denoting
objects and terms denoting types. For the sake of readability, we use the uppercase
Roman letters $A, B, \ldots$ to range over terms denoting types. The notation $A \to B$
is used for $\Pi x{:}A.B$ when $x$ does not appear free in $B$.

A PTS type judgment has the form $\Gamma \vdash M : A$ where $\Gamma$ is a typing context,
i.e., a list $x_1 : A_1, \ldots, x_n : A_n$ of type assignments for variables, where $n \geq 0$ and
$x_i \neq x_j$ for $i \neq j$. We use the Greek letters $\Gamma, \Delta$ to range over typing contexts.
A variable $x$ is fresh (in $\Gamma$) if $x \neq x_i$ for $1 \leq i \leq n$. The typing rules of the
PTS defined by $(\mathcal{S}, \mathcal{A}, \mathcal{R})$ are shown in Figure 1. In rules (Start) and (Weak),
we assume that $x$ is fresh in $\Gamma$.



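¹ In a typed calculus, the word context has two meanings: variable declaration and
expression with a distinguished hole. When confusion may arise, we will write typing
context in the former case and expression context in the latter one.

\[
\frac{\Gamma \vdash A : s}{\Gamma, x{:}A \vdash x : A}\ \text{(Start)}
\qquad
\frac{\Gamma \vdash A : s \quad \Gamma \vdash M : B}{\Gamma, x{:}A \vdash M : B}\ \text{(Weak)}
\]

\[
\frac{\Gamma \vdash A : s_1 \quad \Gamma, x{:}A \vdash B : s_2 \quad (s_1, s_2, s_3) \in \mathcal{R}}{\Gamma \vdash \Pi x{:}A.B : s_3}\ \text{(Prod)}
\qquad
\frac{\Gamma, x{:}A \vdash M : B \quad \Gamma \vdash \Pi x{:}A.B : s}{\Gamma \vdash \lambda x{:}A.M : \Pi x{:}A.B}\ \text{(Abs)}
\]

\[
\frac{\Gamma \vdash M : \Pi x{:}A.B \quad \Gamma \vdash N : A}{\Gamma \vdash (M\ N) : B[N/x]}\ \text{(Appl)}
\qquad
\frac{\Gamma \vdash M : A \quad \Gamma \vdash B : s \quad A \equiv B}{\Gamma \vdash M : B}\ \text{(Conv)}
\]

Fig. 1. PTS typing rules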

We here consider the extension of PTS with η-equality, i.e., the relation
$\equiv$ in rule (Conv) is the congruence relation induced by the rewrites β:
$(\lambda x{:}A.M\ N) \to M[N/x]$ and η: $\lambda y{:}A.(M\ y) \to M$ where $y$ is not free in
$M$. Recall that $M[N/x]$ denotes atomic substitution of the free occurrences of $x$
in $M$ by $N$, with renaming of bound variables in $M$ when necessary.

A PTS is functional if (1) $(s, s_1), (s, s_2) \in \mathcal{A}$ implies $s_1 = s_2$ and (2)
$(s, s', s_1), (s, s', s_2) \in \mathcal{R}$ implies $s_1 = s_2$. The cube of type systems (Barendregt’s
cube) [2] consists of the PTS such that $\mathcal{S} = \{*, \Box\}$, $\mathcal{A} = \{(*, \Box)\}$, $(*, *, *) \in \mathcal{R}$, and for
$(s_1, s_2, s_3)$ in $\mathcal{R}$, it holds that $s_2 = s_3$. The well-known type systems of Barendregt’s cube,
namely the simply-typed λ-calculus, system F, LF, and the calculus of constructions, are
all functional. For example, the simply-typed λ-calculus has $\mathcal{R} = \{(*, *, *)\}$.
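
As a minimal sketch of these definitions, a PTS specification can be modelled as three finite sets, with the top-sort and functionality conditions written as simple checks. The class name `PTS`, its field and method names, and the representation of sorts as strings are illustrative assumptions, not notation from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PTS:
    """Hypothetical container for a PTS specification (S, A, R)."""
    sorts: frozenset   # S: base set of sorts
    axioms: frozenset  # A: pairs (s1, s2)
    rules: frozenset   # R: triples (s1, s2, s3)

    def top_sorts(self):
        # s1 is a top sort if there is no axiom (s1, s2).
        return self.sorts - {s1 for (s1, _) in self.axioms}

    def is_functional(self):
        # (1) each sort has at most one axiom; (2) each pair (s, s')
        # determines at most one rule (s, s', s'').
        ax, ru = {}, {}
        for s1, s2 in self.axioms:
            if ax.setdefault(s1, s2) != s2:
                return False
        for s1, s2, s3 in self.rules:
            if ru.setdefault((s1, s2), s3) != s3:
                return False
        return True


STAR, BOX = "*", "□"

# Simply-typed lambda-calculus: R = {(*, *, *)}.
SIMPLY_TYPED = PTS(frozenset({STAR, BOX}),
                   frozenset({(STAR, BOX)}),
                   frozenset({(STAR, STAR, STAR)}))

# Calculus of constructions: all four rules (s1, s2, s2) over {*, □}.
COC = PTS(SIMPLY_TYPED.sorts, SIMPLY_TYPED.axioms,
          frozenset((a, b, b) for a in (STAR, BOX) for b in (STAR, BOX)))
```

Because every rule of a cube system has the shape $(s_1, s_2, s_2)$, the third component is determined by the second, which is why `is_functional` succeeds for both examples.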

Explicit Substitutions: The $\lambda\sigma$-calculus [1] is a first-order rewrite system with
two sorts of expressions: terms and substitutions. By using substitutions as first-class
objects and de Bruijn index notation for variables, the $\lambda\sigma$-calculus allows
a first-order encoding of the λ-calculus. As a consequence, technical nuisances due
to higher-order aspects of the λ-calculus (e.g., α-conversion) can be minimized or
eliminated in explicit substitution calculi.

The rewrite system of $\lambda\sigma$ includes a surjective-pairing rule (SCons):
$\underline{1}[S] \cdot (\uparrow \circ\, S) \to S$. Rule (SCons) is responsible for confluence and typing problems
in $\lambda\sigma$ [9, 20]. These problems are overcome in a variant of $\lambda\sigma$, called $\lambda_\mathcal{L}$ [20].
The $\lambda_\mathcal{L}$-calculus has the same general features as $\lambda\sigma$, i.e., a simple, finite, and
first-order presentation, but it does not contain rule (SCons).

Expressions of the untyped version of $\lambda_\mathcal{L}$ are terms: $\underline{1}$, applications $(M\ N)$,
abstractions $\lambda M$, and closures $M[S]$; and explicit substitutions: $\uparrow^n$, $M \cdot S$, and
$S \circ T$; where $M, N$ range over terms, $S, T$ range over substitutions, and $n$ ranges
over natural numbers constructed with $0$ and $n + 1$. The $\lambda_\mathcal{L}$-calculus is given
in Figure 2. Free and bound variables are represented by de Bruijn indices.
They are encoded by means of the constant $\underline{1}$ and the substitution $\uparrow^n$. We
overload the notation $\underline{n}$ to represent the $\lambda_\mathcal{L}$-term corresponding to the index $n$,
i.e., $\underline{n+1} = \underline{1}[\uparrow^n]$. An occurrence of an index $\underline{i}$ in a term $M$ is free when that
occurrence is bound by $j$ λ-abstractions and $j < i$. For convenience, we write
free indices to mean free occurrences of indices in a given term.

An explicit substitution denotes a mapping from indices to terms. Thus, $\uparrow^n$
maps each index $\underline{i}$ to the term $\underline{i+n}$, $S \circ T$ is the composition of the mapping
denoted by $T$ with the mapping denoted by $S$ (notice that the composition of
substitutions follows a reverse order with respect to the usual notation of function
composition), and finally, $M \cdot S$ maps the index $\underline{1}$ to the term $M$, and recursively,
the index $\underline{i+1}$ to the term mapped by the substitution $S$ on the index $\underline{i}$.

The $\lambda_\mathcal{L}$-calculus, just as $\lambda\sigma$, uses the composition operation to achieve
confluence on the calculus of substitutions (the calculus without (Beta)).²



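\[
\begin{array}{rcll}
(\lambda M\ N) &\to& M[N \cdot \uparrow^0] & \text{(Beta)} \\
(\lambda M)[S] &\to& \lambda M[\underline{1} \cdot (S \circ \uparrow^1)] & \text{(Lambda)} \\
(M\ N)[S] &\to& (M[S]\ N[S]) & \text{(Application)} \\
M[S][T] &\to& M[S \circ T] & \text{(Clos)} \\
\underline{1}[M \cdot S] &\to& M & \text{(VarCons)} \\
M[\uparrow^0] &\to& M & \text{(Id)} \\
(M \cdot S) \circ T &\to& M[T] \cdot (S \circ T) & \text{(Map)} \\
\uparrow^0 \circ\, S &\to& S & \text{(IdS)} \\
\uparrow^{n+1} \circ\, (M \cdot S) &\to& \uparrow^n \circ\, S & \text{(ShiftCons)} \\
\uparrow^{n+1} \circ \uparrow^m &\to& \uparrow^n \circ \uparrow^{m+1} & \text{(ShiftShift)} \\
\underline{1} \cdot \uparrow^1 &\to& \uparrow^0 & \text{(Shift0)} \\
\underline{1}[\uparrow^n] \cdot \uparrow^{n+1} &\to& \uparrow^n & \text{(ShiftS)}
\end{array}
\]

Fig. 2. The $\lambda_\mathcal{L}$-rewrite system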
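
To make the rewrite system concrete, the following is a minimal sketch of a normalizer for the substitution part of the calculus (all rules of Figure 2 except (Beta)), together with a helper that fires a single (Beta) step. Expressions are encoded as nested tuples; the constructor names, `root_step`, `normalize`, and `beta` are illustrative assumptions, and the right-hand side used for (ShiftShift), $\uparrow^n \circ \uparrow^{m+1}$, follows our reading of the figure.

```python
ONE = ("one",)                             # the constant 1
def shift(n): return ("shift", n)          # ↑^n
def app(m, n): return ("app", m, n)        # (M N)
def lam(m): return ("lam", m)              # λM
def clos(m, s): return ("clos", m, s)      # M[S]
def cons(m, s): return ("cons", m, s)      # M · S
def comp(s, t): return ("comp", s, t)      # S ∘ T


def index(k):
    """de Bruijn index k >= 1, encoded as 1 or 1[↑^(k-1)]."""
    return ONE if k == 1 else clos(ONE, shift(k - 1))


def root_step(e):
    """Apply one substitution rule at the root, or return None."""
    tag = e[0]
    if tag == "clos":
        m, s = e[1], e[2]
        if s == shift(0):                                   # (Id)
            return m
        if m[0] == "lam":                                   # (Lambda)
            return lam(clos(m[1], cons(ONE, comp(s, shift(1)))))
        if m[0] == "app":                                   # (Application)
            return app(clos(m[1], s), clos(m[2], s))
        if m[0] == "clos":                                  # (Clos)
            return clos(m[1], comp(m[2], s))
        if m == ONE and s[0] == "cons":                     # (VarCons)
            return s[1]
    elif tag == "comp":
        s, t = e[1], e[2]
        if s[0] == "cons":                                  # (Map)
            return cons(clos(s[1], t), comp(s[2], t))
        if s == shift(0):                                   # (IdS)
            return t
        if s[0] == "shift" and t[0] == "cons":              # (ShiftCons)
            return comp(shift(s[1] - 1), t[2])
        if s[0] == "shift" and t[0] == "shift":             # (ShiftShift)
            return comp(shift(s[1] - 1), shift(t[1] + 1))
    elif tag == "cons":
        m, s = e[1], e[2]
        if m == ONE and s == shift(1):                      # (Shift0)
            return shift(0)
        if (m[0] == "clos" and m[1] == ONE and m[2][0] == "shift"
                and s == shift(m[2][1] + 1)):               # (ShiftS)
            return shift(m[2][1])
    return None


def normalize(e):
    """Innermost normalization with respect to the substitution rules."""
    if e[0] in ("one", "shift"):
        return e
    e = (e[0],) + tuple(normalize(c) for c in e[1:])
    r = root_step(e)
    return e if r is None else normalize(r)


def beta(redex):
    """(Beta): (λM N) -> M[N · ↑^0]; expects app(lam(M), N)."""
    _, (_, body), arg = redex
    return clos(body, cons(arg, shift(0)))
```

For instance, `normalize(beta(app(lam(index(2)), ONE)))` yields `ONE`: the index 2 under the abstraction refers to the first free variable, which becomes index 1 once the binder is removed. The sketch relies on termination of the substitution rules; with (Beta) added, untyped terms need not terminate, which is why `beta` is kept as a separate single step.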

2 Pure Type Systems with Explicit Substitutions 

In this section, we present an explicit substitution λ-calculus for PTS, namely
PTS$_\mathcal{L}$. The main features of PTS$_\mathcal{L}$ are: a first-order setting insensitive to
α-conversion, typing contexts as explicit substitutions, and support for expression
contexts. As previously pointed out, higher-order aspects of the λ-calculus,
including α-conversion, may be handled in a first-order setting via explicit
substitutions and de Bruijn indices. We use the $\lambda_\mathcal{L}$-calculus as the base calculus of
PTS$_\mathcal{L}$.

The PTS$_\mathcal{L}$-System: As in the case of PTS, a PTS$_\mathcal{L}$ is defined by a triple
$(\mathcal{S}, \mathcal{A}, \mathcal{R})$ of sorts, axioms, and rules. The grammar of well-formed PTS$_\mathcal{L}$
(pseudo-)expressions extends the one of $\lambda_\mathcal{L}$ with sorts $s$, meta-variables $X$, type annotated
abstractions $\lambda_A.M$, products $\Pi_A.B$, and type annotated substitutions $M \cdot_A S$. In
PTS$_\mathcal{L}$, meta-variables, as well as substitutions, are first-class objects. However,
only meta-variables on the sort of terms are allowed. We assume a set $\mathcal{V}$ of
meta-variables. This set will be precisely defined in Section 3.

An expression is ground if it does not contain meta-variables. A ground
expression is also pure if it does not contain other explicit substitutions than those
appearing in the terms denoting de Bruijn indices (i.e., in terms of the form $\underline{i}$).

We define the $\lambda\Pi_\mathcal{L}$-rewrite system as the extension of $\lambda_\mathcal{L}$ with products and
sorts given by Figure 3. The system $\Pi_\mathcal{L}$ is obtained by dropping rule (Beta)
from $\lambda\Pi_\mathcal{L}$.



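\[
\begin{array}{rcll}
(\lambda_A.M\ N) &\to& M[N \cdot_A \uparrow^0] & \text{(Beta)} \\
(\lambda_A.M)[S] &\to& \lambda_{A[S]}.M[\underline{1} \cdot_A (S \circ \uparrow^1)] & \text{(Lambda)} \\
(\Pi_A.B)[S] &\to& \Pi_{A[S]}.B[\underline{1} \cdot_A (S \circ \uparrow^1)] & \\
(M\ N)[S] &\to& (M[S]\ N[S]) & \text{(Application)} \\
M[S][T] &\to& M[S \circ T] & \text{(Clos)} \\
\underline{1}[M \cdot_A S] &\to& M & \text{(VarCons)} \\
M[\uparrow^0] &\to& M & \text{(Id)} \\
s[S] &\to& s & \\
(M \cdot_A S) \circ T &\to& M[T] \cdot_A (S \circ T) & \text{(Map)} \\
\uparrow^0 \circ\, S &\to& S & \text{(IdS)} \\
\uparrow^{n+1} \circ\, (M \cdot_A S) &\to& \uparrow^n \circ\, S & \text{(ShiftCons)} \\
\uparrow^{n+1} \circ \uparrow^m &\to& \uparrow^n \circ \uparrow^{m+1} & \text{(ShiftShift)} \\
\underline{1} \cdot_A \uparrow^1 &\to& \uparrow^0 & \text{(Shift0)} \\
\underline{1}[\uparrow^n] \cdot_A \uparrow^{n+1} &\to& \uparrow^n & \text{(ShiftS)}
\end{array}
\]

Fig. 3. The $\lambda\Pi_\mathcal{L}$-rewrite system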

² The $\lambda\sigma$-calculus is not confluent on general open expressions [9]. However, it is
confluent on semi-open expressions [23]. The $\lambda\sigma_\Uparrow$-calculus, a variant of $\lambda\sigma$, achieves
confluence on general open expressions [9].

Lemma 1. The $\Pi_\mathcal{L}$-calculus is terminating.

Proof. See [20]. The proof, due to Zantema, uses the semantic labeling technique
[27]. □

The set of $\Pi_\mathcal{L}$-normal forms of an expression $x$ (term or substitution) is
denoted by $(x){\downarrow}_{\Pi_\mathcal{L}}$. The equivalence relation $\equiv_{\lambda\Pi_\mathcal{L}}$ (resp. $\equiv_{\Pi_\mathcal{L}}$) is defined as the
congruence relation induced by the rewrite system $\lambda\Pi_\mathcal{L}$ (resp. $\Pi_\mathcal{L}$).

Typing Contexts: Typing contexts in PTS$_\mathcal{L}$ contain more information than
in PTS. In order to combine declaration and definition of variables within a
typing context, we associate a type and a term to each variable declaration.
Indeed, a typing context in PTS$_\mathcal{L}$ is an explicit substitution having the form
$M_1 \cdot_{A_1} \cdots\, M_n \cdot_{A_n} \uparrow^n$, where, for $0 < i \leq n$, $M_i$ is either $\underline{i}$ or is equal to a term
$N_i[\uparrow^i]$. When $M_i$ is $\underline{i}$, we say that the index $i$ is declared; otherwise, we say that
the index $i$ is defined. Furthermore, $M_i$ and $A_i$ are called the term definition
and type declaration of the index $i$, respectively. Note that not every explicit
substitution denotes a typing context. We use the Greek letters $\Gamma$ and $\Delta$ to
range over explicit substitutions denoting typing contexts.

Free indices in term definitions and in type declarations obey different
conventions. Let $\Gamma$ be $M_1 \cdot_{A_1} \cdots\, M_n \cdot_{A_n} \uparrow^n$. Free indices in a term definition $M_i$ are
absolute, i.e., they refer to the whole context $\Gamma$. Cyclic definitions are avoided
by construction. Notice that we require $M_i$ to be equal to $N_i[\uparrow^i]$, for some term
$N_i$. In that case, for all free indices $j$ of $M_i$, it holds that $j > i$. On the other
hand, free indices in a type declaration $A_i$ are relative, i.e., they refer to the
portion of the context where the index $i$ is declared or defined. Therefore, by
using this convention, cyclic declarations are impossible.

Although a different convention for the free indices in a typing context is still
possible, we prefer the one sketched above since it allows an elegant encoding
of contexts as explicit substitutions. The intention is to identify the evaluation
of a term $M$ in a context $\Gamma$ with the $\Pi_\mathcal{L}$-reduction of $M[\Gamma]$. In particular, notice
that a typing context without definitions has the form $\underline{1} \cdot_{A_1} \cdots\, \underline{n} \cdot_{A_n} \uparrow^n$. This
substitution $\Pi_\mathcal{L}$-reduces to $\uparrow^0$. Therefore, as expected, the evaluation of a term
$M$ in a context $\Gamma$ which does not contain definitions results in $M$.
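
As a small worked instance of this claim, take a declaration-only context of length two and read $\underline{2}$ as $\underline{1}[\uparrow^1]$. Assuming the type-annotated forms of rules (ShiftS) and (Shift0), i.e. $\underline{1}[\uparrow^n] \cdot_A \uparrow^{n+1} \to \uparrow^n$ and $\underline{1} \cdot_A \uparrow^1 \to \uparrow^0$, the context collapses to the identity substitution in two steps, after which rule (Id) gives $M[\uparrow^0] \to M$:

\[
\underline{1} \cdot_{A_1} \bigl(\underline{1}[\uparrow^1] \cdot_{A_2} \uparrow^{2}\bigr)
\;\to_{\text{(ShiftS)}}\;
\underline{1} \cdot_{A_1} \uparrow^{1}
\;\to_{\text{(Shift0)}}\;
\uparrow^{0}
\]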

The fact that indices are either defined or declared is rather a convenient way 
to explain typing contexts. The type system does not really distinguish both 
classes of indices. 

Meta-variables and Constraints: Meta-variables and constraints are used to
deal with higher-order unification problems. Informally, meta-variables stand for
instantiation variables and constraints are term equalities to be solved.

Meta-variables are first-class objects in PTS$_\mathcal{L}$. Just as variables, they have
to be declared in order to keep track of possible dependencies between terms
and types. A meta-variable declaration has the form $X :_\Gamma A$, where $\Gamma$ and $A$ are,
respectively, a context and a type assigned to the meta-variable $X$. Indices in $A$
are relative to $\Gamma$.

A constraint $M \simeq_\Gamma N$ relates two terms $M, N$ and a context $\Gamma$. Indices in
$M$ and $N$ are relative to $\Gamma$. Similarly to meta-variables, constraints respect a
typing discipline: terms $M$ and $N$ have the same type in $\Gamma$.

Definition 1 (Constrained Signatures). A (constrained) signature is a list
containing meta-variable declarations and constraint declarations. An empty
signature is denoted by $\epsilon$. Furthermore, if $\Sigma$ is a constrained signature, $X :_\Gamma A.\ \Sigma$
and $M \simeq_\Gamma N.\ \Sigma$ are well-formed constrained signatures.

We overload the notation $\Sigma_1.\ \Sigma_2$ to write the concatenation of the signatures
$\Sigma_1$ and $\Sigma_2$. A meta-variable $X$ is fresh with respect to a signature $\Sigma$, denoted
$X \notin \Sigma$, if there are no $A$ and $\Gamma$ such that $X :_\Gamma A$ is in $\Sigma$.

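Equality: In order to deal with constraints, PTS$_\mathcal{L}$ needs a finer notion of
convertibility than that for PTS (Section 1).

Definition 2 (Equivalence Modulo Constraints). Let $\Sigma$ be a signature; we
define the relation $\equiv_\Sigma$ as the smallest congruence such that (1) if $M \equiv_{\lambda\Pi_\mathcal{L}} N$,
then $M \equiv_\Sigma N$, (2) if $M \simeq_\Gamma N \in \Sigma$, then $M[\Gamma] \equiv_\Sigma N[\Gamma]$, and (3) if $M \equiv_\Sigma N[\uparrow^1]$,
then $\lambda_A.(M\ \underline{1}) \equiv_\Sigma N$.

The last case of the definition above handles η-conversions. In this way, we
avoid considering an η-rule explicitly in the $\lambda\Pi_\mathcal{L}$-calculus.

We extend $\equiv_\Sigma$ to relate typing contexts as follows: (1) $\uparrow^n \equiv_\Sigma \uparrow^n$ and (2)
$M \cdot_A \Gamma \equiv_\Sigma N \cdot_B \Delta$, if $M \equiv_\Sigma N$, $A[\Gamma] \equiv_\Sigma B[\Delta]$, and $\Gamma \equiv_\Sigma \Delta$.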

PTS$_\mathcal{L}$ Typing Rules: In PTS, typing rules for validity of typing contexts are
implicit. However, in that case, structural rules (Start) and (Weak) are necessary
to create an initial context and to add new variable declarations to it. The
notation $\Vdash \Sigma; \Gamma$ captures that $\Gamma$ is valid in the valid signature $\Sigma$. We write
$\Sigma; \Gamma \Vdash M : A$ to state that the term $M$ has type $A$ in $\Sigma; \Gamma$. For explicit
substitutions $S$ we write $\Sigma; \Gamma \Vdash S \triangleright \Delta$ to state that $S$ has type $\Delta$ in $\Sigma; \Gamma$. The
type system for PTS$_\mathcal{L}$ is given in Figure 4. The judgments omit $\Sigma$ when it is
$\epsilon$, and $\Gamma$ when it is $\uparrow^0$. We reserve $\vdash$ for judgments where $\Sigma$ does not contain
constraints; otherwise we use $\Vdash$. A signature $\Sigma$ is valid if $\Vdash \Sigma$ holds.

We use the following functions on contexts:

— $\Gamma \circ \uparrow^i = \Delta$, where $\Delta$ is the normal form of $\Gamma \circ \uparrow^i$ with respect to the rewrite
system composed of the rules (Map), (IdS), (ShiftCons), and (ShiftShift).
We also have to consider the case when the exponent is negative: if $\Gamma \equiv_{\Pi_\mathcal{L}} \Delta \circ \uparrow^j$ and
$j \geq i$, we consider $\Gamma \circ \uparrow^{-i}$ equal to $\Delta \circ \uparrow^{j-i}$.

— To add and remove elements from a context we use the shorthands
$\mathit{push}(M, A, \Gamma) = M \cdot_A (\Gamma \circ \uparrow^1)$, $\mathit{top}(M \cdot_A \Gamma) = A[\Gamma]$, and $\mathit{pop}(M \cdot_A \Gamma) = \Gamma \circ \uparrow^{-1}$.

— The order of a substitution $S$ indicates how many term definitions will
be either consumed, if the number is negative, or produced, if the number
is positive, when $S$ is applied to a context $\Gamma$. Formally, $\mathit{order}(\uparrow^n) = -n$,
$\mathit{order}(M \cdot_A S) = 1 + \mathit{order}(S)$, and $\mathit{order}(S \circ T) = \mathit{order}(S) + \mathit{order}(T)$.
[Figure 4: inference rules for the judgments $\Vdash \Sigma; \Gamma$, $\Sigma; \Gamma \Vdash M : A$, and $\Sigma; \Gamma \Vdash S \triangleright \Delta$, comprising the rules (Empty), (Var-Decl), (Meta-Var-Decl1), (Meta-Var-Decl2), (Constraint), (Sort), (Prod), (Abs), (Appl), (Var), (Meta-Var), (Meta-Var-Sort), (Clos), (Clos-Sort), (Id), (Shift), (Comp), (Cons), (Conv), and (Conv-Subs).]

Fig. 4. The $\lambda\Pi_\mathcal{L}$-type system

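— Given a context $\Gamma$, the operation $\lfloor \Gamma \rfloor_0$ computes a new context where all the
definitions in $\Gamma$ have been transformed into declarations, as follows: $\lfloor \uparrow^m \rfloor_n = \uparrow^n$,
$\lfloor M \cdot_A \Gamma \rfloor_n = \underline{n+1} \cdot_A \lfloor \Gamma \rfloor_{n+1}$.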
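
The order function from the list above is a straightforward structural recursion. The sketch below assumes a tuple encoding of substitutions, `("shift", n)`, `("cons", M, A, S)`, and `("comp", S, T)`; the encoding and the placeholder strings used for terms and types are illustrative assumptions.

```python
def order(s):
    """Number of term definitions produced (> 0) or consumed (< 0) by s."""
    tag = s[0]
    if tag == "shift":   # order(↑^n) = -n
        return -s[1]
    if tag == "cons":    # order(M ·_A S) = 1 + order(S)
        return 1 + order(s[3])
    if tag == "comp":    # order(S ∘ T) = order(S) + order(T)
        return order(s[1]) + order(s[2])
    raise ValueError("not a substitution: %r" % (s,))


# A declaration-only context of length two: 1 ·_A1 2 ·_A2 ↑^2.
context = ("cons", "1", "A1", ("cons", "2", "A2", ("shift", 2)))

# The substitution N ·_A ↑^0 that encodes a single definition.
let_subst = ("cons", "N", "A", ("shift", 0))
```

Here `order(context)` is 0, since the two entries are balanced by the final shift, while `order(let_subst)` is 1, reflecting that the substitution produces one definition.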



Relating PTS$_\mathcal{L}$ to PTS: Since PTS$_\mathcal{L}$ allows arbitrary constraints between
terms, strong normalization, as well as other usual typing properties, can be
easily violated in an arbitrary PTS$_\mathcal{L}$. For instance, the term $(\lambda x{:}A.(x\ x)\ \lambda x{:}A.(x\ x))$
can be typed in any PTS$_\mathcal{L}$ containing the constraint $A \simeq_\Gamma A \to A$. However, as
we will see below, PTS$_\mathcal{L}$ is a conservative extension of PTS. Furthermore, when
only pure expressions are considered, i.e., signatures are empty (and then we use
$\vdash$ rather than $\Vdash$), PTS$_\mathcal{L}$ types as many terms as PTS (modulo $\Pi_\mathcal{L}$-reductions).

We say that a typing context is pure if it is the identity substitution or it
has the form $\underline{1} \cdot_{A_1} \cdots\, \underline{n} \cdot_{A_n} \uparrow^n$ and $A_i$ is pure for $1 \leq i \leq n$. Notice that pure
contexts reduce to $\uparrow^0$ via application of rules (Shift0) and (ShiftS).

Theorem 1 (Conservative Extension). Let $M, A$ be pure terms and $\Gamma$ be a
pure context. Then, $\Gamma \vdash M : A$ in a PTS if and only if $\Gamma \vdash M : A$ in PTS$_\mathcal{L}$ (we
assume a de Bruijn index translation; for details of this translation see [9]).

Proof. Both cases require typing properties for PTS and PTS$_\mathcal{L}$. We refer to [14]
for the proofs of properties in PTS and to Section 4 for a summary of properties in
PTS$_\mathcal{L}$. The case PTS $\Rightarrow$ PTS$_\mathcal{L}$ proceeds by induction on the typing derivation.
Note that if $M \equiv_{\beta\eta} N$ then $M \equiv_\Sigma N$. For the case PTS$_\mathcal{L}$ $\Rightarrow$ PTS, we first prove
by induction on the typing derivation that for arbitrary terms $N, B$, and pure
context $\Gamma$, $\Gamma \vdash N : B$ in PTS$_\mathcal{L}$ implies $\Gamma \vdash (N){\downarrow}_{\Pi_\mathcal{L}} : (B){\downarrow}_{\Pi_\mathcal{L}}$. □

3 Definitions and Explicit Instantiations 

In this section, we address two novel aspects of $\lambda\Pi_\mathcal{L}$: the ability to encode let-in
expressions and the support for expression contexts.

Definitions: Let-in expressions are a convenient way to support definitions
in the λ-calculus. In the simply-typed λ-calculus, an expression such as let $x := N :
A$ in $M$ can be encoded as $(\lambda x{:}A.M\ N)$. In this case, definition unfolding
is performed by the β-reduction mechanism. The behavior of definitions in a
dependent-type system cannot be straightforwardly encoded as a β-reduction.
Consider, for example, the expression let $x := 0 : nat$ in $(m\ (l\ x))$ in a context
$\Gamma = m : (A\ 0) \to nat.\ 0 : nat.\ l : (\Pi n{:}nat.(A\ n)).\ A : nat \to Type.\ nat : Type$.³
Although this term is unfolded into $(m\ (l\ 0))$ in the same way that the term
$((\lambda x{:}nat.(m\ (l\ x)))\ 0)$ β-reduces to $(m\ (l\ 0))$, the term $((\lambda x{:}nat.(m\ (l\ x)))\ 0)$
cannot be typed in $\Gamma$. This is because the information that the variable $x$ will
be substituted by $0$ in $(m\ (l\ x))$ is not taken into account by the application rule.
Indeed, the type of $(l\ x)$ is $(A\ x)$, not $(A\ 0)$ as expected by $m$, and, therefore, the
term $\lambda x{:}nat.(m\ (l\ x))$ is ill-typed in $\Gamma$. Solutions to the above problem require
either considering definitions as first-class terms (not just a macro expansion to
a β-redex) [24] or using a different notation and typing rules for applications [4].
In [3], Bloo proposes a calculus of explicit substitutions for PTS with contexts
extended with term definitions.

As a side effect of combining definitions and declarations in contexts, let-in
expressions can be encoded as explicit substitutions. In PTS$_\mathcal{L}$, as well as in Bloo’s
system, let $x := N : A$ in $M$ is just a shorthand for the term $M[x := N : A]$
(or $M[N \cdot_A \uparrow^0]$ in our nameless notation). On the other hand, the typing rule
for applications remains unmodified.

In contrast to Bloo’s approach, we use a uniform notation for typing con- 
texts and explicit substitutions. Furthermore, we internalize completely the sub- 
stitution mechanism within the theory. In particular, we do not require any 
implicit substitution mechanism. Implicit substitutions are problematic when 
meta-variables are allowed. Notice that meta-variables, in a dependent-type
theory, may also appear in typing contexts.

³ For readability, we use named variables when discussing examples. Nevertheless, as
we have said, PTS$_\mathcal{L}$ uses a de Bruijn nameless notation for variables.

Explicit Instantiations: In term rewriting, an expression context is an
expression with distinguished holes. Filling a hole in an expression context with an
arbitrary term is a first-order substitution that does not take care of renaming
variables. It is well-known that expression contexts in the λ-calculus, and in general in
higher-order rewrite systems, raise technical problems. For instance, contexts are
not compatible with α- and β-conversion. Calculi of contexts for the simply-typed
λ-calculus have been studied in, for instance, [5]. In previous approaches, either
β-redexes cannot be reduced in context terms or new binding structures together
with delicate type systems have been required to handle holes and explicit filling
of holes.

Explicit substitutions overcome these difficulties in the λ-calculus, and thus they
allow us to formulate PTS$_\mathcal{L}$ as a λ-calculus of contexts where meta-variables denote
holes. To complete the framework, we must provide an explicit mechanism to
perform instantiations.

Up to now, meta-variables were given in an abstract set $\mathcal{V}$. In order to
remain in a first-order setting, we index meta-variables with positive natural
numbers. Therefore, we write $X_p$ for a meta-variable indexed by the positive
natural number $p$. Expressions of a PTS$_\mathcal{L}$-system with explicit instantiations
include $M\{\theta\}$ and $S\{\theta\}$ as first-class terms and substitutions, where $\theta$ is a set of
instantiations. We use lists of terms to represent sets of instantiations.

An instantiation $\theta$ is well-formed if it has the form $M_1 \cdot \ldots\, M_n \cdot m$, where
$n \leq m$ and each term $M_i$ is either $X_i$ or a term which does not contain the
meta-variable $X_i$. In the latter case, $M_i$ is called the instantiation term of $X_i$ in
$\theta$. The grade of $\theta$, also written $\mathit{grade}(\theta)$, is given by $m - n$.

The structure of explicit instantiations is analogous to the structure of explicit
substitutions. Indeed, an instantiation $M \cdot \theta$ denotes the replacement of the
meta-variable $X_1$ by $M$, $X_2$ by the head of $\theta$, and so forth. The lookup mechanism
of meta-variables in an explicit instantiation is also analogous to the lookup
of variables in an explicit substitution. The index $p$ of the meta-variable $X_p$ is
consumed at the same time that the instantiation $\theta$ is traversed. The grade of $\theta$
helps to reconstruct the original index of the meta-variable in the case of failed
look-ups, i.e., when the meta-variable $X_p$ does not appear in $\theta$.
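
The lookup just described can be sketched as a short loop. We read the four lookup rules as $X_1\{M \cdot \theta\} \to M$, $X_{p+1}\{M \cdot \theta\} \to X_p\{\theta\}$, $X_p\{n+1\} \to X_{p+1}\{n\}$, and $X_p\{0\} \to X_p$; this reading, the pair encoding `(terms, m)` of an instantiation $M_1 \cdot \ldots\, M_n \cdot m$, and the function names are assumptions made for illustration.

```python
def grade(theta):
    """grade(M1 · ... · Mn · m) = m - n."""
    terms, m = theta
    return m - len(terms)


def lookup(p, theta):
    """Result of X_p{theta} for p >= 1; meta-variables are ("meta", index)."""
    terms, m = theta
    for term in terms:
        if p == 1:
            return term        # X_1{M · θ} -> M
        p -= 1                 # X_{p+1}{M · θ} -> X_p{θ}
    # Failed lookup: X_p{n+1} -> X_{p+1}{n}, repeated m times, then X_p{0} -> X_p.
    return ("meta", p + m)


# A top-level instantiation of grade 0 that leaves X_2 uninstantiated.
theta = (["a", ("meta", 2), "c"], 3)
# Its tail after consuming two entries has grade 2.
tail = (["c"], 3)
```

With these definitions `lookup(5, theta)` returns `("meta", 5)`, so a failed lookup in a grade-0 instantiation restores the original index, while `lookup(2, tail)` returns `("meta", 4)`, i.e. the index shifted by the grade of the tail.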

In contrast to explicit substitutions, instantiations do not care about recalculation
of free indices in expressions. The calculus of explicit instantiations, called
$\lambda_\mathcal{L}^{\{\}}$, is depicted in Figure 5.



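\[
\begin{array}{rcl}
s\{\theta\} &\to& s \\
\underline{1}\{\theta\} &\to& \underline{1} \\
(\Pi_A.B)\{\theta\} &\to& \Pi_{A\{\theta\}}.B\{\theta\} \\
(\lambda_A.N)\{\theta\} &\to& \lambda_{A\{\theta\}}.N\{\theta\} \\
(N_1\ N_2)\{\theta\} &\to& (N_1\{\theta\}\ N_2\{\theta\}) \\
(N[S])\{\theta\} &\to& N\{\theta\}[S\{\theta\}] \\
\uparrow^n\{\theta\} &\to& \uparrow^n \\
(N \cdot_A S)\{\theta\} &\to& N\{\theta\} \cdot_{A\{\theta\}} S\{\theta\} \\
(S \circ T)\{\theta\} &\to& S\{\theta\} \circ T\{\theta\} \\
X_p\{0\} &\to& X_p \\
X_p\{n+1\} &\to& X_{p+1}\{n\} \\
X_1\{M \cdot \theta\} &\to& M \\
X_{p+1}\{M \cdot \theta\} &\to& X_p\{\theta\}
\end{array}
\]

Fig. 5. The $\lambda_\mathcal{L}^{\{\}}$-rewrite system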



Lemma 2. The $\lambda_\mathcal{L}^{\{\}}$-rewrite system is confluent and terminating.

Proof. To show that $\lambda_\mathcal{L}^{\{\}}$ is terminating, we use a measure that penalizes
top-level application of instantiations and the height of trees representing
instantiations. Furthermore, $\lambda_\mathcal{L}^{\{\}}$ is orthogonal, and hence confluent. □



Typing rules for explicit instantiations are shown in Figure 6. We extend the
relation $\equiv_\Sigma$ to consider the case of $\lambda_\mathcal{L}^{\{\}}$-reductions as follows: if $M \to N$ by a
$\lambda_\mathcal{L}^{\{\}}$-reduction, then
$M \equiv_\Sigma N$. We also have a new kind of typing judgment: $\Sigma; n \Vdash \theta \Rightarrow \Sigma'$. It has a
double purpose. In the first place, it enforces the invariant $n = \mathit{grade}(\theta)$. Secondly,
it states that for every instantiation term $M_i$ in $\theta$, $M_i$ is well-typed with respect to
the meta-variable $X_i$. In that case, a constraint $X_i \simeq_\Gamma M_i$ is accumulated in a
new signature $\Sigma'$.



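\[
\frac{\Sigma; 0 \Vdash \theta \Rightarrow \Sigma' \quad \Sigma; \Gamma \Vdash M : A}{\Sigma'.\ \Sigma; \Gamma \Vdash M\{\theta\} : A}\ \text{(Inst-Term)}
\qquad
\frac{\Sigma; 0 \Vdash \theta \Rightarrow \Sigma' \quad \Sigma; \Gamma \Vdash S \triangleright \Delta}{\Sigma'.\ \Sigma; \Gamma \Vdash S\{\theta\} \triangleright \Delta}\ \text{(Inst-Subs)}
\]

\[
\frac{\Sigma; \mathit{grade}(\theta) \Vdash \theta \Rightarrow \Sigma' \quad \Sigma; \Gamma \Vdash X_{\mathit{grade}(\theta)+p} : A}{\Sigma'.\ \Sigma; \Gamma \Vdash X_p\{\theta\} : A}\ \text{(Lookup)}
\qquad
\frac{\Vdash \Sigma}{\Sigma; n \Vdash n \Rightarrow \epsilon}\ \text{(Empty-Inst)}
\]

\[
\frac{\Sigma; n+1 \Vdash \theta \Rightarrow \Sigma'}{\Sigma; n \Vdash X_{n+1} \cdot \theta \Rightarrow \Sigma'}\ \text{(List-Inst1)}
\]

\[
\frac{\Sigma; n+1 \Vdash \theta \Rightarrow \Sigma' \quad \Sigma = \Sigma_1.\ X_{n+1} :_\Gamma A.\ \Sigma_2 \quad \Sigma_2; \Gamma \Vdash M : A}{\Sigma; n \Vdash M \cdot \theta \Rightarrow X_{n+1} \simeq_\Gamma M.\ \Sigma'}\ \text{(List-Inst2)}
\]

Fig. 6. Typing rules for explicit instantiations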



We denote by $x\{\theta\}{\downarrow}$ the $\lambda_\mathcal{L}^{\{\}}$-normal form of an expression $x\{\theta\}$, where $x$
is either a term or a substitution. We extend this notation to contexts and 
signatures as follows: e{9\i = e, ( 0 l.F){ 6 *}| = 0 l{ 6 l}|.F{ 6 *}p {X: rA. T'){ 6*}4 = 
X: and (M ~r N. S){9}^ = M{9}^ 

PTS$_\mathcal{L}$ makes it possible to represent open terms in PTS, i.e., terms with meta-variables.
The instantiation mechanism of meta-variables fills the place-holders in open
terms. In this way, an open term in PTS$_\mathcal{L}$ gradually becomes a pure term in PTS.

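Lemma 3 (Instantiation). Assume $\Sigma; 0 \Vdash \theta \Rightarrow \Sigma'$. (1) If $\Vdash \Sigma; \Delta$, then
$\Vdash \Sigma\{\theta\}{\downarrow}; \Delta\{\theta\}{\downarrow}$, (2) if $\Sigma; \Delta \Vdash N : B$, then $\Sigma\{\theta\}{\downarrow}; \Delta\{\theta\}{\downarrow} \Vdash N\{\theta\}{\downarrow} : B\{\theta\}{\downarrow}$,
and (3) if $\Sigma; \Delta_1 \Vdash S \triangleright \Delta_2$, then $\Sigma\{\theta\}{\downarrow}; \Delta_1\{\theta\}{\downarrow} \Vdash S\{\theta\}{\downarrow} \triangleright \Delta_2\{\theta\}{\downarrow}$.

Proof. First, we prove by simultaneous induction on terms and substitutions
that $\lambda_\mathcal{L}^{\{\}}$-reductions preserve typing, i.e., if $\Sigma; \Gamma \Vdash M : A$ and $M \to N$, then
$\Sigma; \Gamma \Vdash N : A$, and if $\Sigma; \Gamma \Vdash S \triangleright \Delta$ and $S \to T$, then $\Sigma; \Gamma \Vdash T \triangleright \Delta$. Hence,
we can prove that (1) $\Vdash \Sigma'.\ \Sigma\{\theta\}{\downarrow}; \Delta\{\theta\}{\downarrow}$, (2) $\Sigma'.\ \Sigma\{\theta\}{\downarrow}; \Delta\{\theta\}{\downarrow} \Vdash N\{\theta\}{\downarrow} :
B\{\theta\}{\downarrow}$, and (3) $\Sigma'.\ \Sigma\{\theta\}{\downarrow}; \Delta_1\{\theta\}{\downarrow} \Vdash S\{\theta\}{\downarrow} \triangleright \Delta_2\{\theta\}{\downarrow}$. Finally, we use a weakening
property on signatures which states that constraints of the form $X_p \simeq_\Gamma M$ can
be safely removed from a signature when the meta-variable $X_p$ does not appear
as a sub-term in the remaining signature. □

4 Meta-theory

We here summarize a meta-theoretical study of dependent-type versions of
$\lambda\Pi_\mathcal{L}$. It extends [20] by considering explicit definitions and instantiations. Proofs
are available in [21].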

First of all, we prove some properties about contexts. The first explains the
meaning of the typing rules for substitutions. The second states that terms in
valid contexts are always in normal form with respect to term definitions. The
third property states that the order of a valid context is 0. The last one states
that contexts type themselves.

Lemma 4 (Properties about Valid Contexts). (1) // if; T [~ S' c> A, then 
Sor=E Ao ^ (2) If r, then FoT =Uc r. (3) If |- V; T, then 

order(r) = 0. (4) If then r;T[-rt> T. 

Proof. All the proofs are by induction on the typing derivations [21]. □ 

We cannot go further with usual properties such as subject reduction, confluence,
or normalization. As we have mentioned before, when arbitrary constraints are
allowed, these properties can be easily violated. To continue our meta-theoretical
study we make some assumptions. First of all, we consider only functional PTS$_\mathcal{L}$.
Hence, we include all the type systems of Barendregt’s cube. Furthermore,
we consider signatures without constraints. We therefore use $\vdash$ rather than $\Vdash$ to
denote typing judgments.

Even with the assumptions above, the system $\lambda\Pi_\mathcal{L}$ is not confluent on
pseudo-expressions, i.e., on expressions that are not well-typed. To handle this problem,
we follow the development in [20]. The approach is originally due to Geuvers
[13]. It exploits the fact that the rewrite system $\lambda\Pi_\mathcal{L}$ without type annotations
is confluent. We prove the following key lemma.

Lemma 5 (Geuvers’ Lemma). Given a signature $\Sigma$ without constraints, (1)
if $\Pi_{A_1}.B_1 \equiv_\Sigma \Pi_{A_2}.B_2$, then $A_1 \equiv_\Sigma A_2$ and $B_1 \equiv_\Sigma B_2$, and (2) if $(c\ M_1 \ldots M_n)
\equiv_\Sigma N$, then $N \to^{*}_{\lambda\Pi_\mathcal{L}} (c\ N_1 \ldots N_n)$ where $M_i \equiv_\Sigma N_i$.

Proof. First, we prove that $\Pi_\mathcal{L}$ is confluent on expressions without type
annotations on substitutions. Notice that η is not a rewrite rule and that only
meta-variables on the sort of terms are allowed. Then, we show that $\Pi_\mathcal{L}$ commutes
with the parallelization of (Beta) [20]. The conclusion follows from a positive
use of the counter-example for the confluence of the system with type
annotations [13, 20]. □

Geuvers’ lemma is enough to prove the following properties. Most of them 
require simultaneous induction on terms and substitutions. 

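Sort Soundness: If $\Sigma; \Gamma \vdash M : A$, then either $A = s$, where $s$ is a top sort,
or $\Sigma; \Gamma \vdash A : s$, where $s \in \mathcal{S}$.

Type Uniqueness: For a functional PTS$_{\mathcal{L}}$, if $\Sigma; \Gamma \vdash M : A$ and $\Sigma; \Gamma \vdash M : B$,
then $A =_{\Sigma} B$.

Subject Reduction: For a functional PTS$_{\mathcal{L}}$, if $M \to_{\lambda\Pi_{\mathcal{L}}} N$ and $\Sigma; \Gamma \vdash M : A$,
then $\Sigma; \Gamma \vdash N : A$.

Church-Rosser: For a functional weakly normalizing PTS$_{\mathcal{L}}$, if $M_1 =_{\lambda\Pi_{\mathcal{L}}} M_2$,
$\Sigma; \Gamma \vdash M_1 : A$, and $\Sigma; \Gamma \vdash M_2 : A$, then $M_1$ and $M_2$ are $\lambda\Pi_{\mathcal{L}}$-joinable, i.e.,
there exists $M$ such that $M_1 \to^{*}_{\lambda\Pi_{\mathcal{L}}} M$ and $M_2 \to^{*}_{\lambda\Pi_{\mathcal{L}}} M$.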

It is well known that explicit substitution calculi based on $\lambda\sigma$ do not satisfy
the strong normalization property, not even in the simply-typed theory [19]. In
[20], it is shown that $\lambda\Pi_{\mathcal{L}}$ without constraints and without $\eta$-equality is weakly
normalizing on the Calculus of Constructions.

5 Applications 

The PTS$_{\mathcal{L}}$ type system cannot be applied bottom-up because some of the rules
require premises that are not present in the conclusion, and others require
checking equality under $=_{\Sigma}$. On the other hand, it is possible to divide type
checking into three sub-tasks that can be treated separately: (1) type inference, (2)
checking (ground) equality, and (3) unification. This section therefore develops
the relevant machinery for accomplishing these tasks.

Type Inference: We obtain a type inference system from the typing rules by converting
the judgments $\Sigma; \Gamma \vdash M : A$ and $\Sigma; \Gamma \vdash S \triangleright \Delta$ into $\Sigma; \Gamma \vdash M \Rightarrow A, \Sigma'$
and $\Sigma; \Gamma \vdash S \Rightarrow \Delta, \Sigma'$, respectively, where the type $A$, context $\Delta$, and
auxiliary constraints $\Sigma'$ are produced by applying the type inference rules. For
lack of space we will not give the full inference system here, but illustrate one
of the more interesting cases: the rule to infer the type of an application. The
full system appears in [21].

E;F^j^M Ai, El Ei;F^j^N A2, E2 

^ 2 ] j^Ai : S S 3 , E^ E3] F j^A2 : S si, E^ 

F[ ^ Ei {si, S2, S3) £ 71 A = push{l, A2, F) 

E3 = Ai IlA-i-H. F7: ^82- 
E-FY^j^{MN) H\N-a,R],E3 

These rules can be encoded directly with Braga's conditional rewrite rule
extension to Maude [7].

Checking Equality: The rewrite system $\lambda\Pi_{\mathcal{L}}$ provides a way to check equality
under $\beta$-reduction, but not $\eta$-equality. One could add an $\eta$-rule as a conditional
rewrite rule $\lambda_A.(M\ \underline{1}) \to N$ if $M =_{\lambda\Pi_{\mathcal{L}}} N[\uparrow^1]$, and check equality of terms
by first applying reduction to normal form. A more economical approach is to
interleave weak head normal form conversion with rules that decompose
equalities into smaller equalities. By weak head normal form we here mean the minimal
reduction under the $\lambda\Pi_{\mathcal{L}}$-system that allows an equality to be decomposed using the
other rules in Fig. 7. The cases not mentioned there are treated as failures.

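whnf: $M \simeq_{\Gamma} N \;\Longrightarrow\; whnf(M) \simeq_{\Gamma} whnf(N)$

$\lambda$-$\lambda$: $\lambda_A.M \simeq_{\Gamma} \lambda_B.N \;\Longrightarrow\; M \simeq_{push(1,A,\Gamma)} N$

$\Pi$-$\Pi$: $\Pi_{A_1}.B_1 \simeq_{\Gamma} \Pi_{A_2}.B_2 \;\Longrightarrow\; A_1 \simeq_{\Gamma} A_2,\ B_1 \simeq_{push(1,A_1,\Gamma)} B_2$

$\lambda$-N: $\lambda_A.M \simeq_{\Gamma} N \;\Longrightarrow\; M \simeq_{push(1,A,\Gamma)} (N[\uparrow^1]\ \underline{1})$

Subst: $M \simeq N \;\Longrightarrow\; M[\Gamma] \simeq_{\Gamma} N[\Gamma]$

S-S: $s \simeq s \;\Longrightarrow\; \epsilon$

App-App: $(\underline{n}\ M_1 \ldots M_k) \simeq_{\Gamma} (\underline{n}\ N_1 \ldots N_k) \;\Longrightarrow\; M_1 \simeq_{\Gamma} N_1, \ldots, M_k \simeq_{\Gamma} N_k$

Fig. 7. Rules for checking term equality

Our rules are directly structural on the shape of terms. We should note
that the $\lambda$-N rule realizes $\eta$-equality, and that the $\lambda$-$\lambda$ rule does not check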



equality of types because we assume that type constraints on the equated terms 
are checked by other constraints. The alternative notion of algorithmic equality 
studied by Harper and Pfenning [15] recurses on the structure of types in the 
context of LF. It is so far unclear how this idea adapts to type systems with 
impredicative type polymorphism. 
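
The interleaving of weak head normalization with structural decomposition can be sketched on a much simpler calculus than the one of this paper. The following Python fragment is only an illustration under strong assumptions: it uses plain untyped terms with de Bruijn indices starting at 1, implicit (meta-level) substitution instead of explicit substitutions, no meta-variables, and no contexts. All names (`var`, `lam`, `whnf`, `equal`, ...) are hypothetical. It mirrors the whnf, $\lambda$-$\lambda$, $\Pi$-$\Pi$, $\lambda$-N, S-S and App-App cases, and it terminates only on normalizing terms.

```python
# Hypothetical term constructors; binder type annotations are ignored,
# in the same spirit as the lambda-lambda rule, which does not compare them.
def var(n): return ("var", n)
def lam(body): return ("lam", body)
def app(f, a): return ("app", f, a)
def pi(a, b): return ("pi", a, b)
def sort(s): return ("sort", s)

def shift(t, d, cutoff=0):
    """Add d to every index of t that is free under `cutoff` binders."""
    tag = t[0]
    if tag == "var":
        return var(t[1] + d) if t[1] > cutoff else t
    if tag == "lam":
        return lam(shift(t[1], d, cutoff + 1))
    if tag == "app":
        return app(shift(t[1], d, cutoff), shift(t[2], d, cutoff))
    if tag == "pi":
        return pi(shift(t[1], d, cutoff), shift(t[2], d, cutoff + 1))
    return t  # sorts contain no indices

def subst(t, s, k=1):
    """Replace index k of t by s and lower the indices above k."""
    tag = t[0]
    if tag == "var":
        if t[1] == k:
            return shift(s, k - 1)
        return var(t[1] - 1) if t[1] > k else t
    if tag == "lam":
        return lam(subst(t[1], s, k + 1))
    if tag == "app":
        return app(subst(t[1], s, k), subst(t[2], s, k))
    if tag == "pi":
        return pi(subst(t[1], s, k), subst(t[2], s, k + 1))
    return t

def whnf(t):
    """Reduce only as far as needed to expose the head of t."""
    if t[0] == "app":
        f = whnf(t[1])
        if f[0] == "lam":
            return whnf(subst(f[1], t[2]))
        return app(f, t[2])
    return t

def equal(m, n):
    m, n = whnf(m), whnf(n)                      # whnf rule
    if m[0] == "lam" and n[0] == "lam":          # lambda-lambda rule
        return equal(m[1], n[1])
    if m[0] == "lam":                            # lambda-N rule (eta)
        return equal(m[1], app(shift(n, 1), var(1)))
    if n[0] == "lam":                            # symmetric case
        return equal(app(shift(m, 1), var(1)), n[1])
    if m[0] != n[0]:                             # unlisted cases fail
        return False
    if m[0] in ("var", "sort"):                  # S-S and equal heads
        return m[1] == n[1]
    return equal(m[1], n[1]) and equal(m[2], n[2])  # App-App and Pi-Pi
```

For instance, `equal(lam(app(var(2), var(1))), var(1))` holds through the $\lambda$-N case, without ever computing a full normal form.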

Unification and Inhabitation: With the available machinery, the inhabitation
and unification problems for pure type systems are now relatively
straightforward adaptations of [20, 10], and for the case of higher-order pattern
unification of [11]. The full paper [21] contains a detailed treatment, but here we
highlight the main ideas. As usual in higher-order unification, we distinguish
between the cases of rigid-rigid, flex-rigid, and flex-flex pairs.

Rigid-Rigid. The rules from Fig. 7 can be directly adapted to decompose
rigid-rigid pairs.

Flex-Rigid. To solve flex-rigid pairs we need in general to generate terms with
de Bruijn indices applied to meta-variables. To do this in a way that preserves
types we here introduce an auxiliary operation $\vdash^{i}$ that produces an application
to $i$ fresh meta-variables and calculates its type.

N ■. A, Si Si-,r^A:S =A S 2 , S 2 
Xi+i,Yi,Y2 ^ Y 2 Si G 5 (si,S2) G -4 A — push{'l_,Y-]_i F) 

El Xj+i: pYi. Y 2 : AS2- Yi: rsi. E2‘, F {A 2:^ Uy^ -Y2} E^ 

e-F[^^oM ^ M:A,E\ 

A useful property of these judgments is that if $\Sigma; \Gamma \vdash^{i} M$ fails (produces
an unsatisfiable constraint), then $\Sigma; \Gamma \vdash^{j} M$ fails for every $j > i$. This restricts
unbounded branching in the cases where we do not make direct use of
impredicative type abstraction. For example, in LF one computes the necessary $i$ by
inspecting the number of nested $\Pi$ in $A$.

To prepare elimination of a meta-variable in a flex-rigid pair by instantiation,
we furthermore must order meta-variable declarations in signatures according
to their dependencies. For this purpose we introduce an operation to permute
signatures. Permutations respect type derivation in PTS$_{\mathcal{L}}$.

As the final preparation for the flex-rigid rules, we replace occurrences of
$(X[S']\ M)$ in $\Sigma$ by $((\lambda_{A_1}.Y)[S']\ M)$, for a fresh meta-variable $Y :_{push(1,A_1,\Gamma)} A_2$,
when $X :_{\Gamma} \Pi_{A_1}.A_2 \in \Sigma$. This produces flex-rigid pairs that we can finally reduce
as follows:



i: ^ E3{n/x\i if 



{ (X[Mi-Ai...Mp-ApT”]-r(mAfi ■ ■ ■ N,)) & S, X: e E 

h G {1, . . . , p} U (if m > n then {m — n + p} else 0) 
i>0 U;r\--^ih ^ N : A, 

Ui; r {A B} B2 permute{U2) B4 . A: x B. B3 



Flex-Flex. When all rigid-rigid and flex-rigid pairs are solved, we are left with
flex-flex pairs of the form $X[M_1 \cdot_{A_1} \ldots M_p \cdot_{A_p} \uparrow^{n}] \simeq_{\Gamma} Y[N_1 \cdot_{B_1} \ldots N_q \cdot_{B_q} \uparrow^{m}]$.
For an arbitrary PTS these equations are solved by enumerating inhabitants
for $X$ and $Y$ and checking equality using the other rules. The main tool for
enumerating the inhabitants is $\vdash^{i}$, exemplified for the Calculus of Constructions
in [20]. This is unlike the situation when one of the substitutions is a pattern
substitution [11], or if the types of $X$ and $Y$ are simple [16].



6 Related Work and Conclusion 

Dependent-type versions of $\lambda\Pi_{\mathcal{L}}$ have been studied in [20]. The definition of
PTS$_{\mathcal{L}}$ that we present here generalizes those versions in several ways. First of
all, we consider $\beta\eta$-equality instead of just $\beta$-equality. Secondly, we also identify
contexts with a particular kind of explicit substitutions. Thus, let-expressions
can be easily encoded in PTS$_{\mathcal{L}}$. Finally, we advocate explicit instantiations
as a way to obtain a calculus of contexts.

Pure type systems and explicit substitutions have been studied in [25, 3]. 
A polymorphic calculus with explicit substitutions has been proposed in [6]. In 
contrast to our work, in those approaches explicit substitutions are not first- 
class objects and atomic implicit substitutions are required by the type systems. 
In [18], Magnusson uses meta-variables and explicit substitutions to represent
incomplete proof-terms in Martin-Löf Type Theory, but a complete
meta-theoretical study of the system is missing. More recently, Strecker has
developed in [26] a complete meta-theory for a variant of the Extended Calculus of
Constructions [17] that supports meta-variables. Strecker's system provides an
explicit notation for substitutions. However, it cannot really be considered
an explicit substitution calculus. Indeed, in Strecker's approach, the
substitution mechanism is implemented by means of an external substitution operation,
and the explicit notation is just used to suspend the application of external
substitutions to meta-variables. That is, explicit substitutions are only allowed
when they are attached to meta-variables. The PTS$_{\mathcal{L}}$-system completely
internalizes the substitution mechanism into the theory. This theoretical treatment
of substitutions allows finer-grained control over the application of
substitutions. In particular, Strecker's definition of $\beta$ corresponds to a strategy in $\lambda\Pi_{\mathcal{L}}$
where a (Beta)-rule is followed by $\Pi_{\mathcal{L}}$-normalization. Unification algorithms for
dependent types have been studied previously in [22, 12], and for the case of
explicit substitutions in [10]. In comparison to Agda [8], our calculus erases the
distinction between a standard term and a term with an explicit substitution.

Abstract. In the last twenty years, several approaches to higher-order
rewriting have been proposed, among which Klop's Combinatory Rewrite
Systems (CRSs), Nipkow's Higher-order Rewrite Systems (HRSs) and
Jouannaud and Okada's higher-order algebraic specification languages,
of which only the last one considers typed terms. The latter approach has
been extended by Jouannaud, Okada and the present author into
Inductive Data Type Systems (IDTSs). In this paper, we extend IDTSs with
the CRS higher-order pattern-matching mechanism, resulting in
simply-typed CRSs. Then, we show how the termination criterion developed
for IDTSs with first-order pattern-matching, called the General Schema,
can be extended so as to prove the strong normalization of IDTSs with
higher-order pattern-matching. Next, we compare the unified approach
with HRSs. We first prove that the extended General Schema can also 
be applied to HRSs. Second, we show how Nipkow’s higher-order critical 
pair analysis technique for proving local confluence can be applied to 
IDTSs. 

Appendices A, B and C (proofs) are available from the web page. 



1 Introduction 

In 1980, following work by Aczel [1], Klop introduced Combinatory Rewrite
Systems (CRSs) [15, 16] to generalize both first-order term rewriting and rewrite
systems with bound variables like Church's $\lambda$-calculus.

In 1991, after Miller's decidability result for the pattern unification problem
[20], Nipkow introduced Higher-order Rewrite Systems (HRSs) [23] (called
Pattern Rewrite Systems (PRSs) in [18]), to investigate the metatheory of logic
programming languages and theorem provers like $\lambda$Prolog [21] or Isabelle [25].
In particular, he extended to the higher-order case the decidability result of
Knuth and Bendix about local confluence of first-order term rewrite systems.

At the same time, after the works of Breazu-Tannen [6], Breazu-Tannen
and Gallier [7] and Okada [24] on the combination of Church's simply-typed
$\lambda$-calculus with first-order term rewriting, Jouannaud and Okada introduced
higher-order algebraic specification languages [11, 12] to provide a computational
model for typed functional languages extended with first-order and higher-order
rewrite definitions. Later, together with the present author, they extended these
languages with (strictly positive) inductive types, leading to Inductive Data
Type Systems (IDTSs) [5]. This approach has also been adapted to richer type
disciplines like Coquand and Huet's Calculus of Constructions [2, 4], in order
to extend the equality used in proof assistants based on the Curry-De Bruijn-Howard
isomorphism like Coq [10] or Lego [17].

Although CRSs and HRSs seem quite different, they have been precisely 
compared by van Oostrom and van Raamsdonk [31], and shown to have the 
same expressive power, CRSs using a lazier evaluation strategy than HRSs.
On the other hand, although IDTSs seem very close in spirit to CRSs, the relation 
between both systems has not been clearly stated yet. 

Other approaches have been proposed, like Wolfram's Higher-Order Term
Rewriting Systems (HOTRSs) [33], Khasidashvili's Expression Reduction
Systems (ERSs) [14], Takahashi's Conditional Lambda-Calculus (CLC) [27], etc. (see
[29]). To tame this proliferation, van Oostrom and van Raamsdonk introduced
Higher-Order Rewriting Systems (HORSs) [29, 32] in which the matching
procedure is a parameter called "substitution calculus". It appears that most of the
known approaches can be obtained by using an appropriate substitution calculus.
Van Oostrom proved important confluence results for HORSs whose
substitution calculus fulfills some conditions, hence factorizing the existing proofs for the
different approaches.

Many results have been obtained so far about the confluence of CRSs and 
HRSs. On the other hand, for IDTSs, termination was the target of research 
efforts. A powerful and decidable termination criterion has been developed by 
Jouannaud, Okada and the present author, called the General Schema [5]. 

So, one may wonder whether the General Schema may be applied to HRSs, 
and whether Nipkow’s higher-order critical pair analysis technique for proving 
local confluence of HRSs may be applied to IDTSs. 

This paper answers both questions positively. However, we do not consider
the critical interpretation introduced in [5] for dealing with function definitions
over strictly positive inductive types (like Brouwer's ordinals or process algebra).
In Section 3, we show how IDTSs relate to CRSs and extend IDTSs with the 
CRS higher-order pattern-matching mechanism, resulting in simply-typed CRSs. 
In Section 4, we adapt the General Schema to this new calculus and prove in 
Section 5 that the rewrite systems that follow this schema are strongly normal- 
izing (every reduction sequence is finite). In Section 6, we show that it can be 
applied to HRSs. In Section 7, we show that Nipkow’s higher-order critical pair 
analysis technique can be applied to IDTSs. 

For proving the termination of an HRS, other criteria are available. Van de
Pol extended to the higher-order case the use of strictly monotone
interpretations [28]. This approach is of course very powerful but it cannot be automated.
In [13], Jouannaud and Rubio defined an extension to the higher-order case of
Dershowitz' Recursive Path Ordering (HORPO) exploiting the notion of
computable closure introduced in [5] by Jouannaud, Okada and the present author
for defining the General Schema. Roughly speaking, the General Schema may
be seen as a non-recursive version of HORPO. However, HORPO has not yet
been adapted to higher-order pattern-matching.

2 Preliminaries 

We assume that the reader is familiar with simply-typed $\lambda$-calculus [3]. The set
$T(\mathcal{B})$ of types $s, t, \ldots$ generated from a set $\mathcal{B}$ of base types $\mathbf{s}, \mathbf{t}, \ldots$ (in bold font)
is the smallest set built from $\mathcal{B}$ and the function type constructor $\to$. We denote
by $FV(u)$ the set of free variables of a term $u$, and by $u{\downarrow_\beta}$ (resp. $u{\uparrow^\eta}$) the $\beta$-normal
form of $u$ (resp. the $\eta$-long form of $u$).

We use a postfix notation for the application of substitutions,
$\{x_1 \mapsto u_1, \ldots, x_n \mapsto u_n\}$ for denoting the substitution $\theta$ such that $x_i\theta = u_i$ for each $i \in
\{1, \ldots, n\}$, and $\theta \uplus \{x \mapsto u\}$ when $x \notin dom(\theta)$, for denoting the substitution
$\theta'$ such that $x\theta' = u$ and $y\theta' = y\theta$ if $y \neq x$. The domain of a substitution
$\theta$ is the set $dom(\theta)$ of variables $x$ such that $x\theta \neq x$. Its codomain is the set
$cod(\theta) = \{x\theta \mid x \in dom(\theta)\}$.

Whenever we consider abstraction operators, like $\lambda\_.\_$ in $\lambda$-calculus, we work
modulo $\alpha$-conversion, i.e. modulo renaming of bound variables. Hence, we can
always assume that, in a term, the bound variables are pairwise distinct and
distinct from the free variables. In addition, to avoid variable capture when
applying a substitution $\theta$ to a term $u$, we can assume that the free variables of
the terms of the codomain of $\theta$ are distinct from the bound variables of $u$.

We use words over positive numbers for denoting positions in a term. With
a symbol $f$ of fixed arity, say $n$, the positions of the arguments of $f$ are the
numbers $i \in \{1, \ldots, n\}$. We will denote by $Pos(u)$ the set of positions in a term
$u$. The subterm at position $p$ is denoted by $u|_p$. Its replacement by another term
$v$ is denoted by $u[v]_p$.

For the sake of simplicity, we will often use vector notations for denoting
comma- or space-separated sequences of objects. For example, $\{\vec{x} \mapsto \vec{u}\}$ will
denote $\{x_1 \mapsto u_1, \ldots, x_n \mapsto u_n\}$, $n = |\vec{u}|$ being the length of $\vec{u}$. Moreover,
some functions will be naturally extended to sequences of objects. For example,
$FV(\vec{u})$ will denote $\bigcup_{1 \le i \le n} FV(u_i)$ and $\vec{u}\theta$ the sequence $u_1\theta \ldots u_n\theta$.

3 Extending IDTSs with Higher-Order Pattern-Matching à la CRS

In a Combinatory Rewrite System (CRS) [16], the terms are built from variables
$x, y, \ldots$, function symbols $f, g, \ldots$ of fixed arity and an abstraction operator $[\_]\_$
such that, in $[x]u$, the variable $x$ is bound in $u$. On the other hand, left-hand
and right-hand sides of rules are not only built from variables, function symbols
and the abstraction operator like terms, but also from metavariables $Z, Z', \ldots$
of fixed arity. In the left-hand sides of rules, the metavariables must be applied
to distinct bound variables (a condition similar to the one for patterns à la
Miller [18]). By convention, a term $Z(x_{i_1}, \ldots, x_{i_k})$ headed by $[x_1], \ldots, [x_n]$ can
be replaced only by a term $u$ such that $FV(u) \cap \{x_1, \ldots, x_n\} \subseteq \{x_{i_1}, \ldots, x_{i_k}\}$.

For example, in a left-hand side of the form $f([x][y]Z(x))$, the metaterm $Z(x)$
stands for a term in which $y$ cannot occur free, that is, the metaterm $[x][y]Z(x)$
stands for a function of two variables $x$ and $y$ not depending on $y$.

The $\lambda$-calculus itself may be seen as a CRS with the symbol $@$ of arity 2 for
the application, the CRS abstraction operator $[\_]\_$ standing for $\lambda$, and the rule

$@([x]Z(x), Z') \to Z(Z')$

for the $\beta$-rewrite relation. Indeed, by definition of the CRS substitution
mechanism, if $Z(x)$ stands for some term $u$ and $Z'$ for some other term $v$, then $Z(Z')$
stands for $u\{x \mapsto v\}$.

In [5], Inductive Data Type Systems (IDTSs) are defined as extensions of the
simply-typed $\lambda$-calculus with function symbols of fixed arity defined by rewrite
rules. So, an IDTS may be seen as the sub-CRS of well-typed terms, in which
the free variables occurring in rewrite rules are metavariables of arity 0, and only
$\beta$ really uses the CRS substitution mechanism.

As a consequence, restricting matching to first-order matching clearly leads
to non-confluence. For example, the rule

$D(\lambda x.sin(F\ x)) \to \lambda x.(D(F)\ x) \times cos(F\ x)$

defining a formal differential operator $D$ over a function of the form $sin \circ F$,
cannot rewrite a term of the form $D(\lambda x.sin(x))$ since $x$ is not of the form $(u\ x)$.

On the other hand, in the CRS approach, thanks to the notions of
metavariable and substitution, $D$ may be properly defined with the rule

$D([x]sin(F(x))) \to [x]\, @(D([y]F(y)), x) \times cos(F(x))$

where $F$ is a metavariable of arity 1.

This leads us to extend IDTSs with the CRS notions of metavariable and 
substitution, hence resulting in simply-typed CRSs. 

Definition 1 (IDTS - New Definition). An IDTS-alphabet $\mathcal{A}$ is a 4-tuple
$(\mathcal{B}, \mathcal{X}, \mathcal{F}, \mathcal{Z})$ where:

- $\mathcal{B}$ is a set of base types,

- $\mathcal{X}$ is a family $(X_t)_{t \in T(\mathcal{B})}$ of sets of variables,

- $\mathcal{F}$ is a family $(F_{s_1,\ldots,s_n,s})_{n \ge 0,\ s_1,\ldots,s_n,s \in T(\mathcal{B})}$ of sets of function symbols,

- $\mathcal{Z}$ is a family $(Z_{s_1,\ldots,s_n,s})_{n \ge 0,\ s_1,\ldots,s_n,s \in T(\mathcal{B})}$ of sets of metavariables,

such that all the sets are pairwise disjoint.

The set of IDTS-metaterms over $\mathcal{A}$ is $\mathcal{I}(\mathcal{A}) = \bigcup_{t \in T(\mathcal{B})} I_t$ where the $I_t$ are the
smallest sets such that:

(1) $X_t \subseteq I_t$,

(2) if $x \in X_s$ and $u \in I_t$, then $[x]u \in I_{s \to t}$,

(3) if $f \in F_{s_1,\ldots,s_n,s}$, $u_1 \in I_{s_1}, \ldots, u_n \in I_{s_n}$, then $f(u_1, \ldots, u_n) \in I_s$,

(4) if $Z \in Z_{s_1,\ldots,s_n,s}$, $u_1 \in I_{s_1}, \ldots, u_n \in I_{s_n}$, then $Z(u_1, \ldots, u_n) \in I_s$.

We say that a metaterm $u$ is of type $t \in T(\mathcal{B})$ if $u \in I_t$. The set of metavariables
occurring in a metaterm $u$ is denoted by $Var(u)$. A term is a metaterm with no
metavariable.

A metaterm $l$ is an IDTS-pattern if every metavariable occurring in $l$ is
applied to a sequence of distinct bound variables.
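
The pattern condition is purely syntactic, so it can be checked by one traversal that tracks the variables bound so far. The sketch below is illustrative only and ignores types; the tuple encoding of metaterms (`("var", x)`, `("abs", x, body)`, `("fun", f, args)`, `("meta", Z, args)`) and the name `is_pattern` are assumptions made for this example, not notation from the paper.

```python
def is_pattern(t, bound=frozenset()):
    """True if every metavariable of t is applied to distinct bound variables."""
    tag = t[0]
    if tag == "var":
        return True
    if tag == "abs":                       # [x]u binds x in u
        return is_pattern(t[2], bound | {t[1]})
    if tag == "fun":                       # f(u1, ..., un)
        return all(is_pattern(a, bound) for a in t[2])
    # Metavariable Z(u1, ..., un): every argument must be a variable,
    # the variables must be pairwise distinct, and all must be bound.
    names = [a[1] for a in t[2] if a[0] == "var"]
    return (len(names) == len(t[2])
            and len(set(names)) == len(names)
            and all(x in bound for x in names))

# D([x]sin(F(x))) is a pattern; f(F(c)) is not, since F is applied to c.
lhs = ("fun", "D", [("abs", "x", ("fun", "sin", [("meta", "F", [("var", "x")])]))])
not_a_pattern = ("fun", "f", [("meta", "F", [("fun", "c", [])])])
```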

An IDTS-rewrite rule is a pair $l \to r$ of metaterms such that:

(1) $l$ is an IDTS-pattern,

(2) $l$ is headed by a function symbol,

(3) $Var(r) \subseteq Var(l)$,

(4) $r$ has the same type as $l$,

(5) $l$ and $r$ are closed ($FV(l) = FV(r) = \emptyset$).

An $n$-ary substitute of type $s_1 \to \ldots \to s_n \to s$ is an expression of the form
$\lambda(\vec{x}).u$ where $\vec{x}$ are distinct variables of respective types $s_1, \ldots, s_n$ and $u$ is a
term of type $s$. An IDTS-valuation $\sigma$ is a type-preserving map associating an
$n$-ary substitute to each metavariable of arity $n$. Its (postfix) application to a
metaterm returns a term defined as follows:

- $x\sigma = x$

- $([x]u)\sigma = [x]u\sigma$ $\quad (x \notin FV(cod(\sigma)))$

- $f(\vec{u})\sigma = f(\vec{u}\sigma)$

- $Z(\vec{u})\sigma = v\{\vec{x} \mapsto \vec{u}\sigma\}$ if $\sigma(Z) = \lambda(\vec{x}).v$
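
The four clauses above translate almost literally into a recursive function. The following sketch is an untyped illustration under assumptions: metaterms are encoded as tuples (`("var", x)`, `("abs", x, body)`, `("fun", f, args)`, `("meta", Z, args)`), a valuation is a dictionary mapping a metavariable name to a pair `(params, body)` standing for $\lambda(\vec{x}).v$, and variable capture is assumed to be excluded by the naming convention of Section 2 rather than handled by renaming. The names `substitute` and `apply_valuation` are hypothetical.

```python
def substitute(t, theta):
    """Apply a variable substitution theta (dict: name -> term) to t."""
    tag = t[0]
    if tag == "var":
        return theta.get(t[1], t)
    if tag == "abs":
        # The bound variable shadows any binding of the same name.
        inner = {x: u for x, u in theta.items() if x != t[1]}
        return ("abs", t[1], substitute(t[2], inner))
    return (tag, t[1], [substitute(a, theta) for a in t[2]])

def apply_valuation(t, sigma):
    """Postfix application t.sigma of a valuation to a metaterm."""
    tag = t[0]
    if tag == "var":                                   # x sigma = x
        return t
    if tag == "abs":                                   # ([x]u) sigma = [x](u sigma)
        return ("abs", t[1], apply_valuation(t[2], sigma))
    args = [apply_valuation(a, sigma) for a in t[2]]
    if tag == "fun":                                   # f(u) sigma = f(u sigma)
        return ("fun", t[1], args)
    params, body = sigma[t[1]]                         # sigma(Z) = lambda(x).v
    return substitute(body, dict(zip(params, args)))   # v{x -> u sigma}

# The beta rule @([x]Z(x), Z') -> Z(Z') under sigma(Z) = lambda(x).sin(x), sigma(Z') = c.
sigma = {"Z": (["x"], ("fun", "sin", [("var", "x")])), "Z'": ([], ("fun", "c", []))}
lhs = ("fun", "@", [("abs", "x", ("meta", "Z", [("var", "x")])), ("meta", "Z'", [])])
rhs = ("meta", "Z", [("meta", "Z'", [])])
```

With these definitions, `apply_valuation(lhs, sigma)` is the redex `@([x]sin(x), c)` and `apply_valuation(rhs, sigma)` is its contractum `sin(c)`.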

An IDTS $\mathcal{I}$ is a pair $(\mathcal{A}, \mathcal{R})$ where $\mathcal{A}$ is an IDTS-alphabet and $\mathcal{R}$ is a set of
IDTS-rewrite rules over $\mathcal{A}$. Its corresponding rewrite relation $\to_{\mathcal{I}}$ is the subterm
compatible closure of the relation containing every pair $l\sigma \to r\sigma$ such that $l \to
r \in \mathcal{R}$ and $\sigma$ is an IDTS-valuation over $\mathcal{A}$.

The following class of IDTSs will interest us especially: 

Definition 2 ($\beta$-IDTS). An IDTS $(\mathcal{A}, \mathcal{R})$ where $\mathcal{A} = (\mathcal{B}, \mathcal{X}, \mathcal{F}, \mathcal{Z})$ is a $\beta$-IDTS
if, for every pair $s, t \in T(\mathcal{B})$, there is:

(1) a function symbol $@_{s,t} \in F_{s \to t, s, t}$,

(2) a rule $\beta_{s,t} = @([x]Z(x), Z') \to Z(Z') \in \mathcal{R}$,

and no other rule has a left-hand side headed by $@$.

Given an IDTS $\mathcal{I}$, we can always add new symbols and new rules so as to
obtain a $\beta$-IDTS. We will denote by $\beta\mathcal{I}$ this $\beta$-extension of $\mathcal{I}$.

For short, we will denote $@(\ldots @(@(v, u_1), u_2), \ldots, u_n)$ by $@(v, \vec{u})$.

The strong normalization of $\beta\mathcal{I}$ trivially implies the strong normalization
of $\mathcal{I}$. However, the study of $\beta\mathcal{I}$ seems a necessary step because the application
symbol $@$ together with the rule $\beta$ are the essence of the substitution mechanism.
Should we replace in the right-hand sides of the rules every metaterm of the form
$Z(\vec{u})$ by $@([\vec{x}]Z(\vec{x}), \vec{u})$, the system would lead to the same normal forms.

In Appendix A, we list some results about the relations between $\mathcal{I}$ and $\beta\mathcal{I}$.

4 Definition of the General Schema 

Throughout this section and the following one, we fix a given $\beta$-IDTS $\mathcal{I} = (\mathcal{A}, \mathcal{R})$.
First, we adapt the definition of the General Schema given in [5] to take into
account the notion of metavariable. Then, we prove that if the rules of $\mathcal{R}$ follow
this schema, then $\to_{\mathcal{I}}$ is strongly normalizing.

The General Schema is a syntactic criterion which ensures the strong nor- 
malization of IDTSs. It has been designed so as to allow a strong normalization 
proof by the technique of computability predicates introduced by Tait for proving 
the normalization of the simply-typed $\lambda$-calculus [26, 9]. Hereafter, we only give
basic definitions. The reader will find more details in [5]. 

Given a rule with left-hand side $f(\vec{l})$, we inductively define a set of admissible
right-hand sides that we call the computable closure of $\vec{l}$, starting from the
accessible metavariables of $\vec{l}$. The main problem will be to prove that the computable
closure is indeed a set of "computable" terms whenever the terms in $\vec{l}$ are
"computable". This is the objective of Lemma 13 below. The notion of computable
closure was first introduced by Jouannaud, Okada and the present author
in [5, 4] for defining the General Schema, but it has also been used by Jouannaud
and Rubio in [13] for strengthening their Higher-Order Recursive Path Ordering.

For each base type $\mathbf{s}$, we assume given a set $C_{\mathbf{s}} \subseteq \bigcup_{p \ge 0,\ s_1,\ldots,s_p \in T(\mathcal{B})} F_{s_1,\ldots,s_p,\mathbf{s}}$
whose elements are called the constructors of $\mathbf{s}$. When a function symbol is a
constructor, we may denote it by the lower case letters $c, d, \ldots$

This induces the following relation on base types: $\mathbf{t}$ depends on $\mathbf{s}$ if there is
a constructor $c \in C_{\mathbf{t}}$ such that $\mathbf{s}$ occurs in the type of one of the arguments
of $c$. Its reflexive and transitive closure $\ge_{\mathcal{B}}$ is a quasi-ordering whose associated
equivalence relation (resp. strict ordering) will be denoted by $=_{\mathcal{B}}$ (resp. $>_{\mathcal{B}}$).

We say that a constructor $c \in C_{\mathbf{s}}$ is positive if every base type $\mathbf{t} =_{\mathcal{B}} \mathbf{s}$ occurs
only at positive positions (w.r.t. the type constructor $\to$) in the types of the
arguments of $c$. $c$ is basic if it is positive and has no functional arguments. A
type is positive (resp. basic) if all its constructors are positive (resp. basic).
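
The dependency quasi-ordering and the positivity condition can both be computed from the constructor declarations alone. The sketch below is only an illustration under assumptions: a base type is a string, a function type $a \to b$ is the tuple `("->", a, b)`, and `constructors` maps each base type to the list of argument-type lists of its constructors. The function names are hypothetical.

```python
def base_types(ty):
    """Set of base types occurring in a type."""
    if isinstance(ty, str):
        return {ty}
    return base_types(ty[1]) | base_types(ty[2])

def depends_closure(constructors):
    """Reflexive-transitive closure of 't depends on s', as pairs (t, s)."""
    names = set(constructors)
    for ctors in constructors.values():
        for arg_types in ctors:
            for ty in arg_types:
                names |= base_types(ty)
    geq = {(t, t) for t in names}
    for t, ctors in constructors.items():
        for arg_types in ctors:
            for ty in arg_types:
                geq |= {(t, s) for s in base_types(ty)}
    changed = True
    while changed:                      # naive transitive closure
        changed = False
        for (a, b) in list(geq):
            for (c, d) in list(geq):
                if b == c and (a, d) not in geq:
                    geq.add((a, d))
                    changed = True
    return geq

def occurrences(ty, positive=True):
    """Yield (base type, polarity) for every occurrence in ty."""
    if isinstance(ty, str):
        yield ty, positive
    else:
        yield from occurrences(ty[1], not positive)   # left of an arrow flips polarity
        yield from occurrences(ty[2], positive)

def is_positive(s, arg_types, geq):
    """Positivity of a constructor of s with the given argument types."""
    equivalent = {t for (t, u) in geq if u == s and (s, t) in geq}
    return all(pos for ty in arg_types
               for (t, pos) in occurrences(ty) if t in equivalent)

# nat: zero, succ(nat).  ord: zero, succ(ord), lim(nat -> ord).  bad: d(bad -> nat).
constructors = {
    "nat": [[], ["nat"]],
    "ord": [[], ["ord"], [("->", "nat", "ord")]],
    "bad": [[("->", "bad", "nat")]],
}
geq = depends_closure(constructors)
```

Here the limit constructor of Brouwer's ordinals is positive (`ord` occurs only to the right of the arrow), whereas the constructor of `bad` is not.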

Definition 3 (Accessible Subterms). The set $Acc(v)$ of accessible subterms
of a metaterm $v$ is the smallest set such that:

(1) $v \in Acc(v)$

(2) if $[x]u \in Acc(v)$ then $u \in Acc(v)$

(3) if $c(\vec{u}) \in Acc(v)$ then each $u_i \in Acc(v)$

(4) if $f(\vec{u}) \in Acc(v)$ and $u_i$ is of basic type then $u_i \in Acc(v)$

(5) if $@(u, x) \in Acc(v)$, $x \notin FV(u) \cup FV(v)$ then $u \in Acc(v)$

(6) if $@(x, \vec{u}) \in Acc(v)$, $x \notin FV(\vec{u}) \cup FV(v)$ then each $u_i \in Acc(v)$.

By abuse of notation, we will say that a metavariable $Z$ is accessible in $v$ if there
are distinct bound variables $\vec{x}$ such that $Z(\vec{x}) \in Acc(v)$.

For example, $F$ is accessible in $v = [x]sin(F(x))$ since $sin(F(x))$ is accessible
in $v$ by (2), and thus, $F(x)$ is accessible in $v$ by (3).
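
The closure in Definition 3 can be computed with a simple worklist. The sketch below is a partial, untyped illustration: it covers only rules (1) to (4), leaves out the application rules (5) and (6), and does not verify that the arguments of a metavariable are bound. The tuple encoding of metaterms, the set `constructors`, the caller-supplied predicate `is_basic`, and the function names are all assumptions made for this example. Following the example in the text, `sin` is treated like a constructor.

```python
def accessible(v, constructors, is_basic=lambda t: False):
    """Accessible subterms of v, using rules (1)-(4) only."""
    acc, todo = [], [v]                          # rule (1)
    while todo:
        t = todo.pop()
        if t in acc:
            continue
        acc.append(t)
        if t[0] == "abs":                        # rule (2)
            todo.append(t[2])
        elif t[0] == "fun":
            for a in t[2]:
                # rule (3): arguments of a constructor;
                # rule (4): arguments of basic type of any function symbol
                if t[1] in constructors or is_basic(a):
                    todo.append(a)
    return acc

def metavariable_accessible(z, v, constructors, is_basic=lambda t: False):
    """True if some Z(x1, ..., xn) with distinct variables xi is accessible in v."""
    for t in accessible(v, constructors, is_basic):
        if t[0] == "meta" and t[1] == z:
            names = [a[1] for a in t[2] if a[0] == "var"]
            if len(names) == len(t[2]) == len(set(names)):
                return True
    return False

v = ("abs", "x", ("fun", "sin", [("meta", "F", [("var", "x")])]))
```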

Compared to [5], we express the accessibility with respect to a fixed $v$. This
has no consequence on the definition of computable closure since, among the
accessible subterms, only the free variables (here, the metavariables) are taken
into account. Accessibility enjoys the following property:

Property 4. If $u \in Acc(v)$ then $u\sigma \in Acc(v\sigma)$.




Termination and Confluence of Higher-Order Rewrite Systems 






For proving termination, we are led to compare the arguments of a function
symbol with the arguments of the recursive calls generated by its reductions.
To this end, each function symbol $f \in \mathcal{F}$ is equipped with a status $stat_f$ which
specifies how to make the comparison as a simple combination of multiset and
lexicographic comparisons. Then, an ordering on terms $\ge$ is easily extended to an
ordering on sequences of terms $\ge_{stat_f}$. The reader will find precise definitions in
[5]. To fix ideas, one can assume that $\ge_{stat_f}$ is the lexicographic extension $\ge_{lex}$
or the multiset extension $\ge_{mul}$ of $\ge$. We will denote by $>_{stat_f}$ (resp. $\simeq_{stat_f}$) the
strict ordering (resp. equivalence relation) associated to $\ge_{stat_f}$. $>_{stat_f}$ is
well-founded if the strict ordering associated to $\ge$ is well-founded.
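
To make the two basic statuses concrete, here is a small sketch of the strict lexicographic and multiset extensions of a strict ordering. It is a simplification under assumptions: the equivalence associated with the ordering is taken to be plain equality (`==`), sequences compared lexicographically have the same length, and the multiset extension is the usual Dershowitz-Manna one. The precise status definitions are those of [5]; the names `lex_gt` and `mul_gt` are hypothetical.

```python
def lex_gt(us, vs, gt):
    """Strict lexicographic extension of gt to sequences of equal length."""
    for u, v in zip(us, vs):
        if u == v:
            continue          # equal components: look at the next position
        return gt(u, v)       # first difference decides
    return False              # all components equal

def mul_gt(us, vs, gt):
    """Strict multiset extension of gt."""
    us, vs = list(us), list(vs)
    for u in list(us):        # cancel the elements common to both multisets
        if u in vs:
            us.remove(u)
            vs.remove(u)
    # Something must remain on the left, and every remaining element on the
    # right must be dominated by some remaining element on the left.
    return bool(us) and all(any(gt(u, v) for u in us) for v in vs)

def greater(a, b):
    return a > b              # example base ordering on integers
```

For example, `mul_gt([3], [2, 2, 1], greater)` holds because the single element 3 dominates every element it is replaced by, while `lex_gt([1, 5], [2, 1], greater)` fails on the first component.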

$\mathcal{R}$ induces the following relation on function symbols: $g$ depends on $f$ if there
is a rewrite rule defining $g$ (i.e. whose left-hand side is headed by $g$) in the
right-hand side of which $f$ occurs. Its reflexive and transitive closure is a quasi-ordering
denoted by $\ge_{\mathcal{F}}$ whose associated equivalence relation (resp. strict ordering) will
be denoted by $=_{\mathcal{F}}$ (resp. $>_{\mathcal{F}}$).

Finally, we make the following

Assumptions (A)

(1) Every constructor is positive.

(2) No left-hand side of a rule is headed by a constructor.

(3) Both $>_{\mathcal{B}}$ and $>_{\mathcal{F}}$ are well-founded.

(4) $stat_f = stat_g$ whenever $f =_{\mathcal{F}} g$.



The first assumption comes from the fact that, from non-positive inductive 
types, it is possible to build non-terminating terms [19]. The second assumption 
ensures that if a constructor-headed term is computable, then its arguments are 
computable too. The third assumption ensures that types and function defini- 
tions are not cyclic. The fourth assumption says that the arguments of equivalent 
symbols must be compared in the same way. 

For comparing the arguments, the subterm ordering $\unlhd$ used in [5] is not
satisfactory anymore because of the metavariables which must be applied to
some arguments. For example, $[x]F(x)$ is not a subterm of $[x]sin(F(x))$. This
can be repaired by using the following ordering.

Definition 5 (Covered-Subterm Ordering). We say that a metaterm $u$ is
a covered-subterm of a metaterm $v$, written $u \unlhd v$, if there are two positions
$p \in Pos(v)$ and $q \in Pos(v|_p)$ such that (see the figure):

- $u = v[v|_{pq}]_p$,

- $\forall r < p$, $v|_r$ is headed by an abstraction,

- $\forall r < q$, $v|_{pr}$ is headed by a function symbol (which can be a constructor).

Property 6.

(1) $\rhd$ is stable by valuation: if $u \rhd v$ and $\sigma$ is a valuation, then $u\sigma \rhd v\sigma$.

(2) $\rhd$ is stable by substitution: if $u \rhd v$ and $\theta$ is a substitution, then $u\theta \rhd v\theta$.

(3) $\rhd$ commutes with $\to$: if $u \rhd v$ and $v \to w$ then there is a term $v'$ such that
$u \to v'$ and $v' \rhd w$.
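
Definition 5 can be tested by brute force over all pairs of positions $(p, q)$. The sketch below is an untyped illustration under assumptions: metaterms are encoded as tuples (`("var", x)`, `("abs", x, body)`, `("fun", f, args)`, `("meta", Z, args)`), positions are tuples of positive integers, the body of an abstraction sits at position 1, and terms are compared syntactically rather than modulo renaming of bound variables. All function names are hypothetical.

```python
def children(t):
    if t[0] == "abs":
        return [t[2]]
    if t[0] in ("fun", "meta"):
        return list(t[2])
    return []

def subterm(t, p):
    """The subterm of t at position p."""
    for i in p:
        t = children(t)[i - 1]
    return t

def replace(t, p, new):
    """The term t with the subterm at position p replaced by new."""
    if not p:
        return new
    kids = children(t)
    kids[p[0] - 1] = replace(kids[p[0] - 1], p[1:], new)
    if t[0] == "abs":
        return ("abs", t[1], kids[0])
    return (t[0], t[1], kids)

def positions(t):
    yield ()
    for i, c in enumerate(children(t), start=1):
        for p in positions(c):
            yield (i,) + p

def is_covered_subterm(u, v):
    for p in positions(v):
        # every strict prefix r of p must point at an abstraction
        if any(subterm(v, p[:k])[0] != "abs" for k in range(len(p))):
            continue
        vp = subterm(v, p)
        for q in positions(vp):
            # every strict prefix r of q must point at a function symbol
            if any(subterm(vp, q[:k])[0] != "fun" for k in range(len(q))):
                continue
            if replace(v, p, subterm(vp, q)) == u:   # u = v[v|pq]_p
                return True
    return False

Fx = ("meta", "F", [("var", "x")])
v = ("abs", "x", ("fun", "sin", [Fx]))
u = ("abs", "x", Fx)
```

With $p = 1$ and $q = 1$, the check confirms that $[x]F(x)$ is a covered-subterm of $[x]sin(F(x))$ although it is not a subterm of it, whereas $F(x)$ alone is not a covered-subterm.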




Finally, we come to the definition of computable closure. 

Definition 7 (Computable Closure). Given a function symbol
$f \in F_{s_1,\ldots,s_n,s}$, the computable closure $CC_f(\vec{l})$ of a metaterm $f(\vec{l})$ is the least
set $CC$ such that:

(1) if $Z \in Z_{t_1,\ldots,t_p,t}$ is accessible in $\vec{l}$ and $\vec{u}$ are $p$ metaterms of $CC$ of respective
types $t_1, \ldots, t_p$, then $Z(\vec{u}) \in CC$;

(2) if $x \in X_t$ then $x \in CC$;

(3) if $c \in C_t \cap F_{t_1,\ldots,t_p,t}$ and $\vec{u}$ are $p$ metaterms of $CC$ of respective types
$t_1, \ldots, t_p$, then $c(\vec{u}) \in CC$;

(4) if $u$ and $v$ are two metaterms of $CC$ of respective types $s \to t$ and $s$ then
$@(u, v) \in CC$;

(5) if $u \in CC$ then $[x]u \in CC$;

(6) if $h \in F_{t_1,\ldots,t_p,t}$, $h <_{\mathcal{F}} f$ and $\vec{w}$ are $p$ metaterms of $CC$ of respective types
$t_1, \ldots, t_p$, then $h(\vec{w}) \in CC$;

(7) if $g \in F_{t_1,\ldots,t_p,t}$, $g =_{\mathcal{F}} f$ and $\vec{u}$ are $p$ metaterms of $CC$ of respective
types $t_1, \ldots, t_p$ such that $\vec{u} \lhd_{stat_f} \vec{l}$ then $g(\vec{u}) \in CC$.

Note that we do not consider in case (7) the notion of critical interpretation 
introduced in [5] for proving the termination of function definitions over strictly 
positive types (like Brouwer’s ordinals or process algebra). 

Definition 8 (General Schema). A rewrite rule $f(\vec{l}) \to r$ follows the General
Schema GS if $r \in CC_f(\vec{l})$.

A first example is given by the rule $\beta$ itself: $@([x]Z(x), Z') \to Z(Z')$ ($Z$ and
$Z'$ are both accessible).

$D([x]sin(F(x))) \to [x]@(D([y]F(y)), x) \times cos(F(x))$ also follows the General
Schema since $x$ and $y$ belong to the computable closure of $[x]sin(F(x))$ by (2),
hence $F(x)$ and $F(y)$ by (1) since $F$ is accessible in $[x]sin(F(x))$, $[y]F(y)$ by
(5), $D([y]F(y))$ by (7) since $[y]F(y)$ is a strict covered-subterm of $[x]sin(F(x))$,
$@(D([y]F(y)), x)$ by (4), $cos(F(x))$ by (3), $@(D([y]F(y)), x) \times cos(F(x))$ by (6)
and the whole right-hand side by (5).

5 Termination Proof

The termination proof follows Tait's technique of computability predicates [26,
9]. Computability predicates are sets of strongly normalizable terms satisfying
appropriate conditions. For each type, we define an interpretation which is a
computability predicate and we prove that every term is computable, i.e. it
belongs to the interpretation of its type. For precise definitions, see [5].

The main things to know are:

- Computability implies strong normalizability.

- If $u$ is a term of type $s \to t$, then it is computable iff, for every computable
term $v$ of type $s$, $@(u, v)$ is computable.

- Computability is preserved by reduction.

- A term is neutral if it is neither constructor-headed nor an abstraction. A
neutral term $u$ is computable if all its immediate reducts are computable.

- A constructor-headed term $c(\vec{u})$ is computable iff all the terms in $\vec{u}$ are
computable.

- For basic types, computability is equivalent to strong normalizability.

Definition 9 (Computable Valuation). A substitution is computable if all
the terms of its codomain are computable. A substitute $\lambda(\vec{x}).u$ is computable if
for any computable substitution $\theta$ such that $dom(\theta) \subseteq \{\vec{x}\}$, $u\theta$ is computable.
Finally, a valuation $\sigma$ is computable if, for every metavariable $Z$, the substitute
$\sigma(Z)$ is computable.



Lemma 10 (Compatibility of Accessibility with Computability). If $u \in
Acc(v)$ and $v$ is computable, then for any computable substitution $\theta$ such that
$dom(\theta) \cap FV(v) = \emptyset$, $u\theta$ is computable.

Proof. By induction on $Acc(v)$. Without loss of generality, we can assume
that $dom(\theta) \subseteq FV(u)$ since $u\theta = u\theta|_{FV(u)}$.

(1) Immediate.

(2) $\theta$ is of the form $\theta' \uplus \{x \mapsto x\theta\}$ where $dom(\theta') \cap FV(v) = \emptyset$. By induction
hypothesis, $([x]u)\theta'$ is computable. By taking $x$ away from $FV(cod(\theta'))$,
$([x]u)\theta' = [x]u\theta'$ and $u\theta = u\theta'\{x \mapsto x\theta\}$ is a reduct of $@([x]u\theta', x\theta)$, hence
it is computable since $x\theta$ is computable.

(3) By induction hypothesis, $c(\vec{u})\theta = c(\vec{u}\theta)$ is computable. Hence, by definition
of the interpretation for inductive types, $u_i\theta$ is computable.

(4) By induction hypothesis, $f(\vec{u})\theta = f(\vec{u}\theta)$ is computable. Hence $u_i\theta$ is strongly
normalizable, and since, for terms of basic type, computability is equivalent
to strong normalizability, $u_i\theta$ is computable.

(5) $u$ must be of type $s \to t$. So, let $w$ be a computable term of type $s$. Since
$x \notin FV(u)$, $x \notin dom(\theta)$. Then, let $\theta' = \theta \uplus \{x \mapsto w\}$. $\theta'$ is computable
and $dom(\theta') \cap FV(v) = \emptyset$ since $x \notin FV(v)$. Hence, by induction hypothesis,
$@(u, x)\theta' = @(u\theta, w)$ is computable.







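(6) Since $x \notin FV(\vec{u})$, $x \notin dom(\theta)$. Then, let $\theta' = \theta \uplus \{x \mapsto [\vec{y}]y_i\}$, $[\vec{y}]y_i$ being the
$i$-th projection. $\theta'$ is computable and $dom(\theta') \cap FV(v) = \emptyset$ since $x \notin FV(v)$.
Hence, by induction hypothesis, $@(x, \vec{u})\theta' = @([\vec{y}]y_i, \vec{u}\theta)$ is computable and
its $\beta$-reduct $u_i\theta$ also.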

Corollary 11. Let $l$ be a pattern, $v$ a term and $\sigma$ a valuation such that $l\sigma = v$.
If $Z$ is accessible in $l$ and $v$ is computable, then $\sigma(Z)$ is computable.

For proving Lemma 14 below, we will reason by induction on $(f, \vec{u})$ with the
ordering $\sqsupseteq\ = (>_{\mathcal{F}}, (\to_{mul} \cup \rhd_{stat_f}))_{lex}$, $\vec{u}$ being strongly normalizable arguments
of $f$. Since $\rhd$ commutes with $\to$, we can prove that $\rhd_{stat_f}\to_{mul}$ is included into
$\to_{mul}^{=}\rhd_{stat_f}$ where $\to_{mul}^{=}$ means zero or one $\to_{mul}$-step. This implies that
$\to_{mul} \cup \rhd_{stat_f}$ is well-founded since:

Lemma 12. If $a$ and $b$ are two well-founded relations such that $ab \subseteq b^{*}a$ then
$a \cup b$ is well-founded.

Therefore the strict ordering associated to $\sqsupseteq$ is well-founded since $>_{\mathcal{F}}$ is
assumed to be well-founded. Now, we can prove the correctness of the computable
closure.

Lemma 13 (Computable Closure Correctness). Let $f(\vec{l})$ be a pattern. Assume
that $\sigma$ is a computable valuation and that the terms in $\vec{l}\sigma$ are computable.
Assume also that, for every function symbol $h$ and sequence of computable terms
$\vec{w}$ such that $(f, \vec{l}\sigma) \sqsupset (h, \vec{w})$, $h(\vec{w})$ is computable. Then, for every $r \in CC_f(\vec{l})$,
$r\sigma$ is computable.

Proof. The proof, by induction on $CC_f(\vec{l})$, is quite similar to the one given
in [5] except that, now, one has to deal with valuations instead of substitutions.
The main difference is in case (1) for metavariables. We only give this case. A
full proof can be found in Appendix C.

In fact, we prove that, for any computable valuation $\sigma$ such that
$FV(cod(\sigma)) \cap FV(r) = \emptyset$, for any computable substitution $\theta$ such that
$dom(\theta) \subseteq FV(r)$ and for any $r \in CC_f(\vec{l})$, $r\sigma\theta = r\theta\sigma$ is computable.

(1) $r = Z(\vec{v})$ where $Z$ is a metavariable accessible in $\vec{l}$ and $\vec{v}$ are metaterms
of $CC_f(\vec{l})$. We first prove it for a special case and then for the general case.

(a) $\vec{v}$ is a sequence of distinct bound variables, say $\vec{x}$. Without loss of
generality, we can assume that $\sigma(Z) = \lambda(\vec{x}).w$. Then, $r\sigma\theta = w\theta$. Since
$\sigma$ is computable and $dom(\theta) \subseteq \{\vec{x}\} = FV(r)$, $w\theta$ is computable.

(b) $r\sigma\theta$ is a $\beta$-reduct of the term $@([\vec{x}]Z(\vec{x})\sigma\theta, \vec{v}\sigma\theta)$ where $\vec{x}$ are fresh
distinct variables. By case (1a) and (5), $[\vec{x}]Z(\vec{x})\sigma\theta$ is computable and
since, by induction hypothesis, the terms in $\vec{v}\sigma\theta$ are also computable,
$r\sigma\theta$ is computable.

Lemma 14 (Computability of Function Symbols). If all the rules satisfy
the General Schema then, for every function symbol $f$, $f(\vec{u})$ is computable
whenever the terms in $\vec{u}$ are computable.

Proof. If $f$ is a constructor then this is immediate since the terms in $\vec{u}$ are
computable by assumption. Assume now that $f$ is a function symbol. Since $f(\vec{u})$
is neutral, to prove that $f(\vec{u})$ is computable, it suffices to prove that all its
immediate reducts are computable. We prove this by induction on $(f, \vec{u})$ with
$\sqsupset$ as well-founded ordering.

Let $v$ be an immediate reduct of $f(\vec{u})$. $v$ is either a head-reduct of $f(\vec{u})$ or
of the form $f(u_1, \ldots, u_i', \ldots, u_n)$ with $u_i'$ being an immediate reduct of $u_i$.

In the latter case, as computability predicates are stable by reduction, $u_i'$ is
computable. Hence, since $(f, u_1 \ldots u_i' \ldots u_n) \sqsubset (f, \vec{u})$, by induction hypothesis,
$f(u_1, \ldots, u_i', \ldots, u_n)$ is computable.

In the former case, there is a rule $f(\vec{l}) \to r$ and a valuation $\sigma$ such that $\vec{u} = \vec{l}\sigma$
and $v = r\sigma$. By definition of the computable closure, and since $Var(r) \subseteq Var(\vec{l})$,
every metavariable occurring in $r$ is accessible in $\vec{l}$. Hence, since the terms in $\vec{l}\sigma$ are
computable, by Corollary 11, $\sigma|_{Var(r)}$ is computable. Therefore, by Lemma 13,
$r\sigma = r\sigma|_{Var(r)}$ is computable.

Theorem 15 (Strong Normalization). Let $\mathcal{I} = (\mathcal{A}, \mathcal{R})$ be a $\beta$-IDTS satisfying
the assumptions (A). If all the rules of $\mathcal{R}$ satisfy the General Schema, then
$\to_{\mathcal{I}}$ is strongly normalizing.

Proof. One can easily prove that, for every term $u$ and computable substitution
$\theta$, $u\theta$ is computable. In the case where $u = f(\vec{u})$, we conclude by Lemma 14.
The theorem follows easily since the identity substitution is computable.

It is possible to improve this termination result as follows. After [12], if $\mathcal{R}$
follows the General Schema and $\mathcal{R}_1$ is a terminating set of non-duplicating^
first-order rewrite rules, then $\mathcal{R} \cup \mathcal{R}_1$ is also terminating.



6 Application of the General Schema to HRSs 

We just recall what a HRS is. The reader can find precise definitions in [18]. A
HRS is a pair $(\mathcal{A}, \mathcal{R})$ made of a HRS-alphabet $\mathcal{A}$ and a set $\mathcal{R}$ of HRS-rewrite
rules over $\mathcal{A}$. A HRS-alphabet is a triple $(\mathcal{B}, \mathcal{X}, \mathcal{F})$ where $\mathcal{B}$ is a set of base types,
$\mathcal{X}$ is a family $(X_s)_{s \in T(\mathcal{B})}$ of variables and $\mathcal{F}$ is a family $(F_s)_{s \in T(\mathcal{B})}$ of function
symbols. The corresponding HRS-terms are the terms of the simply-typed
$\lambda$-calculus built over $\mathcal{X}$ and $\mathcal{F}$ that are in $\eta$-long $\beta$-normal form.

So, a HRS $\mathcal{H}$ can be seen as an IDTS $\langle\mathcal{H}\rangle$ with the same symbols, the arity of
which being determined by the maximum number of arguments they can take,
plus the symbol @ for the application. Hence it is a $\beta$-IDTS. In [31], van Oostrom
and van Raamsdonk studied this translation in detail and proved:

Lemma 16 (Van Oostrom and van Raamsdonk [31]). Let $\mathcal{H}$ be a HRS.
If $u \to_{\mathcal{H}} v$ then $\langle u\rangle \to_{\langle\mathcal{H}\rangle} \to_{\beta}^{*} \langle v\rangle$ where $\langle v\rangle$ is in $\beta$-normal form.



^ No metavariable occurs more often in the right-hand side than in the left-hand side.

As a consequence, $\mathcal{H}$ is strongly normalizing if $\langle\mathcal{H}\rangle$ is. Thus, the General
Schema can be used on $\langle\mathcal{H}\rangle$ for proving the termination of $\mathcal{H}$. In fact, it can be
used directly on $\mathcal{H}$ if we adapt the notions of accessible subterm and computable
closure to HRSs. See Appendix B for details.

Theorem 17 (Strong Normalization for HRSs). Let $\mathcal{H} = (\mathcal{A}, \mathcal{R})$ be a HRS
satisfying the assumptions (A). If all the rules of $\mathcal{R}$ satisfy the General Schema
for HRSs, then $\to_{\mathcal{H}}$ is strongly normalizing.

Proof. This results from the fact, proved in Appendix B, that if $\mathcal{H}$ follows the
General Schema for HRSs then $\langle\mathcal{H}\rangle$ follows the General Schema for IDTSs.

7 Confluence of IDTSs 

First of all, since an IDTS is a sub-GRS, it is confluent whenever the underlying 
GRS is confluent. This is the case if it is weakly orthogonal, i.e. it is left-linear 
and all (higher-order) critical pairs are equal [29], or if it is left-linear and all 
critical pairs are development closed [30] . 

Now, one may wonder whether Nipkow’s result for local confluence of HRSs 
[18] may be applied to IDTSs. To this end, we need to interpret an IDTS as a 
HRS. This can be done in the following natural way: 

Definition 18 (Natural Translation of IDTSs into HRSs). An IDTS- 
alphabet A = {B,X,T,Z) can be naturally translated into the HRS-alphabet 
'H(A) = where: 

~ U Uo<p<n 

— *... — *Sn — *S Uo<p<n Csi,...,Sp,Sp+i — *... — >Sn — *S 

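An IDTS-metaterm $u$ is naturally translated into a HRS-term $\mathcal{H}(u)$ as follows:

- $\mathcal{H}(x) = x\uparrow^{\eta}$
- $\mathcal{H}(f(\vec{u})) = (f\ \mathcal{H}(\vec{u}))\uparrow^{\eta}$
- $\mathcal{H}([x]u) = \lambda x.\mathcal{H}(u)$
- $\mathcal{H}(Z(\vec{u})) = (Z\ \mathcal{H}(\vec{u}))$

Finally, an IDTS $\mathcal{I} = (\mathcal{A}, \mathcal{R})$ is translated into the HRS $\mathcal{H}(\mathcal{I}) = (\mathcal{H}(\mathcal{A}), \mathcal{H}(\mathcal{R}))$
where $\mathcal{H}(\mathcal{R}) = \{\mathcal{H}(l) \to \mathcal{H}(r) \mid l \to r \in \mathcal{R}\}$.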

However, for Nipkow’s result to hold, the rewrite rules must be of base type,
which is not necessarily the case for IDTSs. This is why, in their study of the
relations between GRSs and HRSs [31], van Oostrom and van Raamsdonk defined
a translation from GRSs to HRSs, also denoted by $\langle\ \rangle$, which uses a new symbol
$\Lambda$ for forcing the translated terms to be of base type. Furthermore, they proved
that (1) if $u \to_{\mathcal{I}} v$ then $\langle u\rangle \to_{\langle\mathcal{I}\rangle} \langle v\rangle$, and (2) if $\langle u\rangle \to_{\langle\mathcal{I}\rangle} v'$ then there is a
term $v$ such that $\langle v\rangle = v'$ and $u \to_{\mathcal{I}} v$. In fact, it is no more difficult to prove
the same property for the translation $\mathcal{H}$. As a consequence, since $\langle\ \rangle$ (resp. $\mathcal{H}$) is
injective, the (local) confluence of $\langle\mathcal{I}\rangle$ (resp. $\mathcal{H}(\mathcal{I})$) implies the (local) confluence
of $\mathcal{I}$. Thus it is possible to deduce the local confluence of $\mathcal{I}$ from the analysis of
the critical pairs of $\langle\mathcal{I}\rangle$ (resp. $\mathcal{H}(\mathcal{I})$), and indeed, it turns out that $\langle\mathcal{I}\rangle$ and $\mathcal{H}(\mathcal{I})$
have the “same” critical pairs (see the proof of Theorem 19 in Appendix G for
details). Identifying $\mathcal{I}$ with its natural translation $\mathcal{H}(\mathcal{I})$, we claim that:

Theorem 19. If every critical pair of $\mathcal{I}$ is confluent, then $\mathcal{I}$ is locally confluent.

It would also have been possible to consider the translation $\mathcal{H}'$ which is identical
to $\mathcal{H}$ but pulls the rewrite rules down to base type by taking
$\mathcal{H}'(f(\vec{l}) \to r) = (f\ \mathcal{H}(\vec{l})\ \vec{x}) \to v$ if $\mathcal{H}(r) = \lambda\vec{x}.v$ with $v$ of base type. Note that the left-hand
side is still a pattern. Then, it is possible to prove that $\mathcal{H}(\mathcal{I})$ and $\mathcal{H}'(\mathcal{I})$
also have the same critical pairs.



8 Conclusion 

In Inductive Data Type Systems (IDTSs) [5], the use of first-order matching does
not allow some functions to be defined as expected, resulting in non-confluent
computations. By extending IDTSs with the higher-order pattern-matching mechanism
of Klop’s Combinatory Reduction Systems (CRSs) [16], we solved this problem
and made clear the relation between IDTSs and CRSs: IDTSs with higher-order
pattern-matching are simply-typed CRSs.

We extended a decidable termination criterion defined for IDTSs with first-order
matching, called the General Schema [5], to the case of higher-order
pattern-matching, and we proved that a rewrite system following this schema is
strongly normalizing.

We also compared this unified approach to Nipkow’s Higher-order Rewrite
Systems (HRSs) [18]. First, we proved that the extended General Schema can
be applied to HRSs. Second, we showed how Nipkow’s higher-order critical pair
analysis technique for proving local confluence can be applied to IDTSs.

Now, several extensions should be considered. 

We did not take into account the interpretation defined in [5] for dealing
with definitions over strictly positive types (like Brouwer’s ordinals or process
algebra). However, we expect that it can also be adapted to higher-order
pattern-matching.

It is also important to be able to relax the pattern condition which says that 
metavariables must be applied to distinct bound variables. But it is not clear 
how to prove the termination with Tait’s computability predicates technique 
when this condition is not satisfied. 

Another point is that some computations often need to be performed within 
some equational theories like commutativity or commutativity and associativity 
of some function symbols. It would be interesting to know if the General Schema 
technique can be adapted for dealing with such equational theories. 

Finally, one may wonder whether all these results could be established in the
more general framework of van Oostrom and van Raamsdonk’s Higher-Order
Rewriting Systems (HORSs) [29, 32], under some suitable conditions over the
substitution calculus.

Abstract. We propose a formalism for higher-order rewriting in de
Bruijn notation. This notation is used not only for terms (as usually
done in the literature) but also for metaterms, which are the syntactical
objects used to express general higher-order rewrite systems. We give formal
translations from higher-order rewriting with names to higher-order
rewriting with de Bruijn indices, and vice-versa. These translations can
be viewed as an interface in programming languages based on higher-order
rewrite systems, and they are also used to show some properties,
namely, that both formalisms are operationally equivalent, and that
confluence is preserved when translating one formalism into the other.



1 Introduction 

Higher-order (term) rewriting concerns the transformation of terms in the
presence of binding mechanisms for variables. Implementing higher-order rewriting
requires, beforehand, taking care of a complex notion of substitution operation
and of renaming of bound variables ($\alpha$-conversion). As a paradigmatic example,
the $\beta$-reduction axiom of $\lambda$-calculus [1], expressed $(\lambda x.M)N \to M\{x \leftarrow N\}$,
may be interpreted as: the result of executing function $\lambda x.M$ over argument
$N$ is obtained by substituting $N$ for all (free) occurrences of $x$ in $M$. Any
implementation of higher-order rewriting must include instructions for computing
this substitution. Although from the meta-level the execution of a substitution
is atomic, the cost of computing it highly depends on the form of the terms,
especially if unwanted variable capture conflicts must be avoided by renaming bound
variables. De Bruijn indices take care of renaming because the representation of
variables by indices completely eliminates unwanted capture of variables. However,
de Bruijn formalisms have only been studied for particular systems (and
only on the term level) and no general framework of higher-order rewriting with
indices has been proposed. We address this problem here by focusing not only
on de Bruijn terms (as usually done in the literature for $\lambda$-calculus [11]) but also
on de Bruijn metaterms, which are the syntactical objects used to express any
general higher-order rewrite system formulated in a de Bruijn context.

A de Bruijn Notation for Higher-Order Rewriting
L. Bachmair (Ed.): RTA 2000, LNCS 1833, pp. 62-79, 2000.
© Springer-Verlag Berlin Heidelberg 2000

Many higher-order rewrite systems (HORS) exist and work in the area is
currently very active: CRS [14], ERS [12], CERS [13], HRS [15], the systems
in [22] and [20]. We choose in this work to use ERS because their syntax and
semantics are simple and natural (they allow, for example, writing $\beta$-reduction in
$\lambda$-calculus as usual while CRS do not) and the correspondence between ERS and
HRS has already been established [21]. We shall begin with (a slightly simplified
version of) the ERS formalism, that we shall call SERS (S for simplified), and
introduce the de Bruijn index based higher-order rewrite system $SERS_{db}$.

Our work is the first step in the construction of a formal interpretation of 
higher-order rewriting via a first-order theory. This kind of simulation would be 
possible with the aid of explicit substitutions. Indeed, this work follows, in some 
sense, the lines of [8] which interprets higher-order formalisms/problems into 
their respective first-order ones. 

Our formalism is developed in order to be used as an interface of a programming
language based on higher-order rewriting. Of course, the use of variable
name based formalisms is necessary for humans to interact with computers in a
user-friendly way. Clearly, technical resources like de Bruijn indices and explicit
substitutions should live behind the scenes; in other words, they should be
implementation concerns. Moreover, whatever is behind the scenes is required to be
as faithful as possible to the formalism it is implementing. So a key issue
shall be the detailed study of the relationship between SERS and $SERS_{db}$.
The translations we propose between them are extensions to higher-order of the
translations studied in [11] and presented in [5].

As regards existing higher-order rewrite formalisms based on de Bruijn index
notation and/or explicit substitutions, to the best of the authors’ knowledge there
are but two: Explicit CRS [3] and XRS [16]. In [3] explicit substitutions à la $\lambda x$
[19,2] are added to the CRS formalism as a first step towards using higher-order
rewriting with explicit substitutions for modeling the evaluation of functional
programs in a faithful way. Since this is done in a variable name setting,
$\alpha$-conversion must be dealt with as in CRS. Pagano’s XRS constitutes the first
HORS which fuses de Bruijn index notation and explicit substitutions. It is
presented as a generalization of the $\lambda\sigma_{\Uparrow}$-calculus [6] but no connection has been
established between XRS and well-known systems such as CRS, ERS and HRS.
Indeed, it is not clear at all how some seemingly natural rules expressible, say,
in the ERS formalism, may be written in an XRS. As an example, consider a
rewrite system for logical expressions such that if $imply(e_1, e_2)$ reduces to the
constant $true$ then $e_1$ logically implies $e_2$ in classical first-order predicate logic.
A possible rewrite rule could be:

(imp) $imply(\exists x \forall y M, \forall y \exists x M) \to true$

A naive attempt might consider the rewrite rule

($imp_{db}$) $imply(\exists\forall M, \forall\exists M) \to true$

as a possible representation of this rule in the XRS formalism, but it does not
have the desired effect since $\exists\forall M$ and $\forall\exists M$ correspond to $\exists x \forall y M$ and $\forall x \exists y M$
but $\forall x \exists y M$ and $\forall y \exists x M$ are not equivalent. Note that regardless of the fact that
XRS incorporate explicit substitutions, this problem arises already at the level
of de Bruijn notation. Another example of interest is:

($\eta$) $\lambda x.(M x) \to M$ if $x$ is not free in $M$

which is usually expressed in a de Bruijn based system with explicit substitutions as

($\eta_{db}$) $\lambda(M\ 1) \to N$ if $M =_{\mathcal{C}} N[\uparrow]$

where $M =_{\mathcal{C}} N$ means that $M$ and $N$ are equivalent modulo the theory of
explicit substitutions $\mathcal{C}$. Neither the (imp) rule nor ($\eta_{db}$) is possible in the XRS
formalism, so XRS do not have in principle the same expressive power as
ERS. We shall propose de Bruijn based HORS that will allow such rules to be
faithfully represented.
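
The mismatch behind the naive ($imp_{db}$) rule can be checked on a concrete body. The following is only an illustrative sketch: the tuple encoding of terms, the function name `to_db`, the predicate symbol `p` and the choice of indices starting at 1 are assumptions made for this example, not part of the formalism.

```python
def to_db(t, env=()):
    """Translate a closed named term into de Bruijn form (indices start at 1).

    Named terms are tuples (an encoding assumed for this sketch):
      ("var", x), ("fun", f, [args]), ("bind", binder, x, body).
    env lists the enclosing bound variables, innermost first.
    """
    kind = t[0]
    if kind == "var":
        # The index is the distance to the binder of the variable.
        return ("idx", env.index(t[1]) + 1)
    if kind == "fun":
        return ("fun", t[1], [to_db(a, env) for a in t[2]])
    _, binder, x, body = t
    return ("bind", binder, to_db(body, (x,) + env))


# The same named body p(x, y) placed under the two binder orders of (imp).
body = ("fun", "p", [("var", "x"), ("var", "y")])
exists_forall = ("bind", "exists", "x", ("bind", "forall", "y", body))
forall_exists = ("bind", "forall", "y", ("bind", "exists", "x", body))

lhs = to_db(exists_forall)   # body becomes p(2, 1)
rhs = to_db(forall_exists)   # body becomes p(1, 2)
```

The two de Bruijn bodies differ, so a single metavariable $M$ used on both sides of ($imp_{db}$) cannot stand for both of them.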

The main contribution of this paper is a general de Bruijn notation for higher- 
order syntax which bridges the gap between higher-order rewriting with names 
and with indices. This formalism suggests a first-order tool to implement HORS, 
which in contrast to [16] would represent all the HORS used in practice. 

The rest of the paper is organized as follows. Section 2 introduces our work
and study scenario, the SERS formalism. The de Bruijn based formalism
$SERS_{db}$ is defined in Section 3. Section 4 takes a close-up view of the
relationship, via appropriate translations, between the formalisms SERS and $SERS_{db}$.
Also, preservation of confluence is considered. Finally, we conclude.

For lack of space we only present here an extended abstract, and therefore
proofs, auxiliary lemmas and standard definitions are only hinted at or just omitted,
but the interested reader will find full details in [4].

2 Simplified Expression Reduction Systems 

We introduce the variable name based higher-order rewrite formalism SERS . 

2.1 Metaterms and Terms 

Definition 1 (Signature). Consider the denumerable and disjoint infinite sets:

- $\mathcal{V} = \{x_1, x_2, x_3, \ldots\}$ a set of variables, arbitrary variables are denoted $x, y, \ldots$
- $\mathcal{BMV} = \{\alpha_1, \alpha_2, \alpha_3, \ldots\}$ a set of pre-bound o-metavariables (o for object),
denoted $\alpha, \beta, \ldots$
- $\mathcal{FMV} = \{\hat{\alpha}_1, \hat{\alpha}_2, \hat{\alpha}_3, \ldots\}$ a set of pre-free o-metavariables, denoted $\hat{\alpha}, \hat{\beta}, \ldots$
- $\mathcal{TMV} = \{X_1, X_2, X_3, \ldots\}$ a set of t-metavariables (t for term), denoted
$X, Y, Z, \ldots$
- $\mathcal{F} = \{f_1, f_2, f_3, \ldots\}$ a set of function symbols equipped with a fixed (possibly
zero) arity, denoted $f, g, h, \ldots$
- $\mathcal{B} = \{\lambda_1, \lambda_2, \lambda_3, \ldots\}$ a set of binder symbols equipped with a fixed (non-zero)
arity, denoted $\lambda, \ldots$

The union of $\mathcal{BMV}$ and $\mathcal{FMV}$ is the set of o-metavariables of the signature.
When speaking of metavariables without further qualifiers we refer to o- and
t-metavariables. Since all these alphabets are ordered, given any symbol $s$ we shall
denote $\mathcal{O}(s)$ its position in the corresponding alphabet.

Definition 2 (Labels). A label is a finite sequence of symbols of an alphabet.
We shall use $k, l, l_i, \ldots$ to denote arbitrary labels and $\epsilon$ for the empty label. If
$s$ is a symbol and $l$ is a label then the notation $s \in l$ means that the symbol $s$
appears in the label $l$, and also, we use $sl$ to denote the new label whose head is $s$
and whose tail is $l$. Other notations are $|l|$ for the length of $l$ (number of symbols
in $l$) and $at(l, n)$ for the $n$-th element of $l$ assuming $n \le |l|$. Also, if $s$ occurs
(at least once) in $l$ then $pos(s, l)$ denotes the position of the first occurrence of
$s$ in $l$. If $\theta$ is a function defined on the alphabet of a label $l = s_1 \ldots s_n$, then $\theta(l)$
denotes the label $\theta(s_1) \ldots \theta(s_n)$. In the sequel, we may use a label as a set (e.g.
$S \cap l$ denotes the intersection of a set $S$ with the set containing the elements of
$l$) if no confusion arises. A simple label is a label without repeated symbols.
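
The label operations above can be pictured with a small sketch. It assumes labels are modelled as Python tuples and that positions are counted from 1; the function names (`cons`, `at`, `pos`, `apply_to_label`, `is_simple`) are hypothetical names chosen for illustration.

```python
EMPTY = ()  # the empty label (epsilon)


def cons(s, l):
    """The label sl whose head is s and whose tail is l."""
    return (s,) + tuple(l)


def at(l, n):
    """The n-th element of l (1-based), defined only when 1 <= n <= |l|."""
    if not 1 <= n <= len(l):
        raise IndexError("at(l, n) requires 1 <= n <= |l|")
    return l[n - 1]


def pos(s, l):
    """Position (1-based) of the first occurrence of s in l.

    Raises ValueError when s does not occur in l.
    """
    return l.index(s) + 1


def apply_to_label(theta, l):
    """theta(l) = theta(s1) ... theta(sn) for l = s1 ... sn."""
    return tuple(theta(s) for s in l)


def is_simple(l):
    """A simple label has no repeated symbols."""
    return len(set(l)) == len(l)


example = cons("a", cons("b", cons("a", EMPTY)))  # the label a b a
```

Here `example` has length 3, `pos("a", example)` picks the first of the two occurrences of `a`, and `is_simple(example)` is false because `a` is repeated.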

Definition 3 (Pre-metaterms). The set of SERS pre-metaterms^, denoted
$\mathcal{PMT}$, is defined by:

$M ::= \alpha \mid \hat{\alpha} \mid X \mid f(M, \ldots, M) \mid \xi\alpha.(M, \ldots, M) \mid M[\alpha \leftarrow M]$

Arities are supposed to be respected and we shall use $M, N, M_i, \ldots$ to denote
pre-metaterms. The symbol $.[. \leftarrow .]$ in the pre-metaterm $M[\alpha \leftarrow M]$ is called
the metasubstitution operator. The o-metavariable $\alpha$ in a pre-metaterm of the form
$\xi\alpha.(M, \ldots, M)$ or $M[\alpha \leftarrow M]$ is referred to as the formal parameter. The set
of binder symbols together with the metasubstitution operator are called binder
operators, thus the metasubstitution operator is a binder operator (since it has
binding power) but not a binder symbol since it is not an element of $\mathcal{B}$.

A pre-metaterm $M$ has an associated tree, denoted $tree(M)$, defined as
expected. In the case of the metasubstitution operator we have: if $T_1, T_2$ are the
trees of $M_1, M_2$, then the tree of $M_1[\alpha \leftarrow M_2]$ has root “sub”, and sons “$[\,]\alpha$”
(with son $T_1$) and $T_2$.

A position is a label over the alphabet $\mathbb{N}$. Given a pre-metaterm $N$ appearing
in $M$, the set of occurrences of $N$ in $M$ is the set of positions of $tree(M)$ where
$N$ occurs (positions in trees are defined as usual). The parameter path of an
occurrence $p$ in a tree $T$ is the list containing all the (pre-bound) o-metavariables
occurring in the path from $p$ to the root of $T$.

^ The main difference between SERS and ERS is that in the latter binders and
metasubstitutions are defined on multiple o-metavariables. Indeed, pre-metaterms
like $\xi\alpha_1 \ldots \alpha_k.(M_1, \ldots, M_m)$ and $M[\alpha_1 \ldots \alpha_k \leftarrow M_1, \ldots, M_k]$ are possible in ERS,
with the underlying hypothesis that $\alpha_1 \ldots \alpha_k$ are all distinct and with the underlying
semantics that $M[\alpha_1 \ldots \alpha_k \leftarrow M_1, \ldots, M_k]$ denotes usual (parallel) substitution. It
is well known that multiple substitution can be simulated by simple substitution.
Furthermore, there is also a notion of scope indicator in ERS, used to express in
which arguments the variables are bound. Scope indicators shall not be considered
in SERS since they do not seem to contribute to the expressive power of ERS.

The following definition introduces the set of metaterms, which are
pre-metaterms that are well-formed in the sense that all the formal parameters
appearing in the same path of a pre-metaterm must be different and all the
metavariables in $\mathcal{BMV}$ only occur bound.

Definition 4 (Metaterms). A pre-metaterm $M$ is a metaterm
iff the predicate $\mathcal{WF}_{\epsilon}(M)$ holds, where $\mathcal{WF}_l(M)$ is defined as follows:

- $\mathcal{WF}_l(\alpha)$ iff $\alpha \in l$
- $\mathcal{WF}_l(\hat{\alpha})$ and $\mathcal{WF}_l(X)$ are always true
- $\mathcal{WF}_l(f(M_1, \ldots, M_n))$ iff for all $1 \le i \le n$ we have $\mathcal{WF}_l(M_i)$
- $\mathcal{WF}_l(\xi\alpha.(M_1, \ldots, M_n))$ iff $\alpha \notin l$ and for all $1 \le i \le n$ we have $\mathcal{WF}_{\alpha l}(M_i)$
- $\mathcal{WF}_l(M_1[\alpha \leftarrow M_2])$ iff $\alpha \notin l$ and $\mathcal{WF}_l(M_2)$ and $\mathcal{WF}_{\alpha l}(M_1)$.

For example, $f(\xi\alpha.(X), \lambda\alpha.(Y))$, $f(\hat{\beta}, \lambda\alpha.(Y))$ and $g(\lambda\alpha.(\xi\beta.(h)))$ are
metaterms, while the pre-metaterms $f(\alpha, \xi\alpha.(X))$ and $f(\hat{\beta}, \lambda\alpha.(\xi\alpha.(X)))$ are not.
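
The well-formedness predicate of Definition 4 translates almost literally into a recursive check. The sketch below assumes a tuple encoding of pre-metaterms (tags `"bound"`, `"free"`, `"tmv"`, `"fun"`, `"bind"`, `"sub"`) and the names `wf` and `is_metaterm`, all of which are hypothetical choices made for illustration.

```python
def wf(m, label=()):
    """WF_l(M) for pre-metaterms encoded as tuples (assumed encoding):
      ("bound", a)            pre-bound o-metavariable
      ("free", a)             pre-free o-metavariable
      ("tmv", X)              t-metavariable
      ("fun", f, [args])      function symbol application
      ("bind", b, a, [args])  binder b with formal parameter a
      ("sub", M1, a, M2)      metasubstitution M1[a <- M2]
    """
    kind = m[0]
    if kind == "bound":
        return m[1] in label
    if kind in ("free", "tmv"):
        return True
    if kind == "fun":
        return all(wf(a, label) for a in m[2])
    if kind == "bind":
        _, _, a, args = m
        return a not in label and all(wf(x, (a,) + label) for x in args)
    if kind == "sub":
        _, m1, a, m2 = m
        return a not in label and wf(m2, label) and wf(m1, (a,) + label)
    raise ValueError("unknown pre-metaterm")


def is_metaterm(m):
    """A pre-metaterm is a metaterm iff WF holds for the empty label."""
    return wf(m, ())


X, Y = ("tmv", "X"), ("tmv", "Y")
good1 = ("fun", "f", [("bind", "xi", "a", [X]), ("bind", "lam", "a", [Y])])
good2 = ("fun", "g", [("bind", "lam", "a", [("bind", "xi", "b", [("fun", "h", [])])])])
bad1 = ("fun", "f", [("bound", "a"), ("bind", "xi", "a", [X])])
bad2 = ("fun", "f", [("free", "b"), ("bind", "lam", "a", [("bind", "xi", "a", [X])])])
```

The four sample values mirror examples from the text: `good1` and `good2` are accepted, `bad1` is rejected because a pre-bound o-metavariable occurs outside any binder, and `bad2` because the same formal parameter is bound twice on one path.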

In the sequel, pre-bound (free) o-metavariables occurring in metaterms shall 
simply be referred to as bound (free) o-metavariables. As we shall see, metaterms 
are used to specify rewrite rules. 

Definition 5 (Free Metavariables of Pre-metaterms). Let $M$ be a
pre-metaterm, then $FMVar(M)$ denotes the set of free metavariables of $M$, which is
defined as follows:

$FMVar(X) = \{X\}$
$FMVar(\alpha) = \{\alpha\}$
$FMVar(\hat{\alpha}) = \{\hat{\alpha}\}$
$FMVar(f(M_1, \ldots, M_n)) = \bigcup_{i=1}^{n} FMVar(M_i)$
$FMVar(\xi\alpha.(M_1, \ldots, M_n)) = (\bigcup_{i=1}^{n} FMVar(M_i)) \setminus \{\alpha\}$
$FMVar(M_1[\alpha \leftarrow M_2]) = (FMVar(M_1) \setminus \{\alpha\}) \cup FMVar(M_2)$

All metavariables which are not free are bound. We use $BMVar(M)$ to denote
the bound metavariables of a metaterm $M$. Note that only o-metavariables may
occur bound in a metaterm. We denote the set of metavariables of a metaterm or
a pre-metaterm $M$ by $MVar(M)$. Note that if $M$ is a metaterm, then $FMVar(M)$
does not contain pre-bound o-metavariables.
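
As a small illustration of Definition 5, the following sketch computes $FMVar$ over a tuple encoding of pre-metaterms. The encoding and the name `fmvar` are assumptions of this example; metavariables are returned as (kind, name) pairs so that pre-bound, pre-free and t-metavariables stay distinguishable.

```python
def fmvar(m):
    """Free metavariables of a pre-metaterm, as a set of (kind, name) pairs.

    Assumed encoding: ("bound", a), ("free", a), ("tmv", X),
    ("fun", f, [args]), ("bind", b, a, [args]), ("sub", M1, a, M2).
    """
    kind = m[0]
    if kind in ("bound", "free", "tmv"):
        return {(kind, m[1])}
    if kind == "fun":
        return set().union(*(fmvar(a) for a in m[2]))
    if kind == "bind":
        _, _, a, args = m
        return set().union(*(fmvar(x) for x in args)) - {("bound", a)}
    _, m1, a, m2 = m  # metasubstitution M1[a <- M2]
    return (fmvar(m1) - {("bound", a)}) | fmvar(m2)


# xi a.(f(a, X, b^)) : the formal parameter a is bound, X and b^ are free.
term = ("bind", "xi", "a",
        [("fun", "f", [("bound", "a"), ("tmv", "X"), ("free", "b")])])
free = fmvar(term)

# a[a <- a] : the occurrence of a in the second component stays free,
# so this pre-metaterm is not a metaterm.
sub = ("sub", ("bound", "a"), "a", ("bound", "a"))
```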

Definition 6 (Terms and Contexts). The set of SERS terms, denoted $\mathcal{T}$,
and contexts are defined by:

Terms $t ::= x \mid f(t, \ldots, t) \mid \xi x.(t, \ldots, t)$

Contexts $C ::= \Box \mid f(t, \ldots, C, \ldots, t) \mid \xi x.(t, \ldots, C, \ldots, t)$

where $\Box$ denotes a “hole”. We shall use $s, t, t_i, \ldots$ for terms and $C, D$ for
contexts. We remark that in contrast to other formalisms dealing with higher-order
rewriting, here the set of terms is not contained in the set of pre-metaterms since
the set of variables and the set of o-metavariables are disjoint. The sets of free
(resp. bound) variables of a term $t$, denoted $FV(t)$ (resp. $BV(t)$), are defined as
usual.

With $C[t]$ we denote the term obtained by replacing the hole $\Box$ in the
context $C$ by the term $t$. Note that this operation may introduce variable capture. We
define the label of a context as a sequence of variables as follows:

$label(\Box) = \epsilon$
$label(f(t_1, \ldots, C, \ldots, t_n)) = label(C)$
$label(\xi x.(t_1, \ldots, C, \ldots, t_n)) = label(C)x$

For example, the label of the context $C = f(\lambda x.(z, \xi y.(h(y, \Box))))$ is the
sequence $yx$. The label of a context is a notion analogous to that of a parameter
path of an occurrence, but defined for terms instead of pre-metaterms and where
the only occurrence considered is that of the hole.
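
The label of a context can be computed by following the path to the hole. The sketch below assumes a tuple encoding of terms and contexts and the names `HOLE` and `label`; these are illustrative choices, and the function returns `None` for a term that contains no hole.

```python
HOLE = ("hole",)


def label(c):
    """Label of a context as a tuple of variables, innermost binder first.

    Assumed encoding: ("var", x), ("fun", f, [args]),
    ("bind", b, x, [args]) and HOLE. Returns None if c contains no hole.
    """
    kind = c[0]
    if kind == "hole":
        return ()
    if kind == "var":
        return None
    if kind == "fun":
        for a in c[2]:
            found = label(a)
            if found is not None:
                return found
        return None
    _, _, x, args = c  # binder: label(C) followed by the bound variable x
    for a in args:
        found = label(a)
        if found is not None:
            return found + (x,)
    return None


# C = f(lam x.(z, xi y.(h(y, HOLE)))), the example context from the text.
ctx = ("fun", "f", [
    ("bind", "lam", "x", [
        ("var", "z"),
        ("bind", "xi", "y", [("fun", "h", [("var", "y"), HOLE])]),
    ]),
])
```

On the example context this yields the sequence $yx$, matching the text.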



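Definition 7 ((Restricted) Substitution of Terms). The (restricted)
substitution of a term $t$ for a variable $x$ in a term $s$, denoted $s\{x \leftarrow t\}$, is defined:

$x\{x \leftarrow t\} = t$
$y\{x \leftarrow t\} = y$ if $x \neq y$
$f(s_1, \ldots, s_n)\{x \leftarrow t\} = f(s_1\{x \leftarrow t\}, \ldots, s_n\{x \leftarrow t\})$
$\xi x.(s_1, \ldots, s_n)\{x \leftarrow t\} = \xi x.(s_1, \ldots, s_n)$
$\xi y.(s_1, \ldots, s_n)\{x \leftarrow t\} = \xi y.(s_1\{x \leftarrow t\}, \ldots, s_n\{x \leftarrow t\})$ if $x \neq y$ and ($y \notin FV(t)$ or $x \notin FV(s)$)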

Thus $.\{. \leftarrow .\}$ denotes the substitution operator on terms but it may not
apply $\alpha$-conversion (renaming of bound variables) in order to avoid unwanted
variable captures. Therefore this notion of substitution is not defined for all
terms (hence its name). When defining the notion of reduction relation on terms
induced by rewrite rules we shall take $\alpha$-conversion into consideration. We may
define $\alpha$-conversion on terms as the smallest reflexive, symmetric and transitive
relation closed by contexts verifying the following equality:

($\alpha$) $\xi x.(s_1, \ldots, s_n) =_{\alpha} \xi y.(s_1\{x \leftarrow y\}, \ldots, s_n\{x \leftarrow y\})$, $y$ not in $s_1, \ldots, s_n$

Note that since $y$ does not occur in $s_1, \ldots, s_n$ the substitution is defined. We shall
use $t =_{\alpha} s$ to denote that the terms $t$ and $s$ are $\alpha$-convertible. This conversion
is sound in the sense that $t =_{\alpha} s$ implies $FV(t) = FV(s)$.
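
The partiality of restricted substitution is easy to see in code. The following sketch assumes terms are encoded as tuples and uses the hypothetical names `fv` and `subst`; `subst` returns `None` exactly where Definition 7 leaves the substitution undefined, and it never renames bound variables.

```python
def fv(t):
    """Free variables of a term encoded as
    ("var", x), ("fun", f, [args]) or ("bind", b, x, [args])."""
    kind = t[0]
    if kind == "var":
        return {t[1]}
    if kind == "fun":
        return set().union(*(fv(a) for a in t[2]))
    _, _, x, args = t
    return set().union(*(fv(a) for a in args)) - {x}


def subst(s, x, t):
    """Restricted substitution s{x <- t}; None when it is undefined."""
    kind = s[0]
    if kind == "var":
        return t if s[1] == x else s
    if kind == "fun":
        args = [subst(a, x, t) for a in s[2]]
        return None if any(a is None for a in args) else ("fun", s[1], args)
    _, b, y, body = s
    if y == x:
        return s  # x is rebound here, nothing to substitute
    if y in fv(t) and x in fv(s):
        return None  # capture would occur and no renaming is allowed
    args = [subst(a, x, t) for a in body]
    return None if any(a is None for a in args) else ("bind", b, y, args)


lam_y_x = ("bind", "lam", "y", [("var", "x")])
ok = subst(lam_y_x, "x", ("var", "z"))        # lam y.(z)
undefined = subst(lam_y_x, "x", ("var", "y"))  # would capture y
```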

The notion of $\alpha$-conversion for terms has a symmetrical one for pre-metaterms
which we call $v$-equivalence ($v$ for variant). The intuitive meaning of two
$v$-equivalent pre-metaterms is that they are able to receive the same set of
potential “valuations” (cf. Definition 10). Thus for example, as one would expect,
$\lambda\alpha.(X) \neq_v \lambda\beta.(X)$ because when $\alpha$ and $X$ are replaced by $x$ and $\beta$ is replaced
by $y$, one obtains $\lambda x.(x)$ and $\lambda y.(x)$, which are not $\alpha$-convertible. However,
since pre-metaterms contain t-metavariables, the notion of $v$-equivalence is not
as straightforward as the notion of $\alpha$-conversion in the case of terms.



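Definition 8 (v-Equivalence for Pre-metaterms). Given pre-metaterms $M$
and $N$, we say that $M$ is $v$-equivalent to $N$ iff $M =_v N$, where $=_v$ is the smallest
reflexive, symmetric and transitive relation closed by metacontexts^ verifying:

(v1) $\xi\alpha.(P_1, \ldots, P_n) =_v \xi\beta.(P_1 \ll \alpha \leftarrow \beta \gg, \ldots, P_n \ll \alpha \leftarrow \beta \gg)$
(v2) $P_1[\alpha \leftarrow P_2] =_v P_1 \ll \alpha \leftarrow \beta \gg [\beta \leftarrow P_2]$

where $\beta$ is a pre-bound o-metavariable which does not occur in $P_1, \ldots, P_n$ in
(v1) and does not occur in $P_1$ in (v2), $P_i$ does not contain t-metavariables for
$1 \le i \le n$, and $P \ll \alpha \leftarrow Q \gg$ is the restricted substitution for pre-metaterms:

$\alpha \ll \alpha \leftarrow Q \gg = Q$
$\alpha' \ll \alpha \leftarrow Q \gg = \alpha'$ if $\alpha \neq \alpha'$
$\hat{\alpha} \ll \alpha \leftarrow Q \gg = \hat{\alpha}$
$X \ll \alpha \leftarrow Q \gg = X$
$f(M_1, \ldots, M_n) \ll \alpha \leftarrow Q \gg = f(M_1 \ll \alpha \leftarrow Q \gg, \ldots, M_n \ll \alpha \leftarrow Q \gg)$
$(\xi\alpha.(M_1, \ldots, M_n)) \ll \alpha \leftarrow Q \gg = \xi\alpha.(M_1, \ldots, M_n)$
$(\xi\alpha'.(M_1, \ldots, M_n)) \ll \alpha \leftarrow Q \gg = \xi\alpha'.(M_1 \ll \alpha \leftarrow Q \gg, \ldots, M_n \ll \alpha \leftarrow Q \gg)$
    if $\alpha \neq \alpha'$ and ($\alpha' \notin FMVar(Q)$ or $\alpha \notin FMVar(P)$)
$(M_1[\alpha \leftarrow M_2]) \ll \alpha \leftarrow Q \gg = M_1[\alpha \leftarrow M_2 \ll \alpha \leftarrow Q \gg]$
$(M_1[\alpha' \leftarrow M_2]) \ll \alpha \leftarrow Q \gg = (M_1 \ll \alpha \leftarrow Q \gg)[\alpha' \leftarrow M_2 \ll \alpha \leftarrow Q \gg]$
    if $\alpha \neq \alpha'$ and ($\alpha' \notin FMVar(Q)$ or $\alpha \notin FMVar(M_1)$)

^ Metacontexts are defined analogously to contexts. The notion of “label of a context”
is extended to metacontexts as expected.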



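Example 1. $\lambda\alpha.(\alpha) =_v \lambda\beta.(\beta)$, $\lambda\alpha.(f) =_v \lambda\beta.(f)$, but $\lambda\alpha.(X) \neq_v \lambda\beta.(X)$,
$\lambda\beta.(\lambda\alpha.(X)) \neq_v \lambda\alpha.(\lambda\beta.(X))$.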

2.2 Reduction 

Whereas the rewrite rules are specified by using metaterms, the reduction rela- 
tion is defined on terms. 

Definition 9 (SERS Rewrite Rule). An SERS rewrite rule is a pair of
metaterms $(G, D)$ (also written $G \to D$) such that

- the first symbol in $G$ is a function symbol or a binder symbol
- $FMVar(D) \subseteq FMVar(G)$
- $G$ contains no occurrence of the metasubstitution operator



Example 2. The $\lambda x$-calculus [3,19] is defined by the following SERS rewrite
rules:

$@(\lambda\alpha.(X), Z) \to_{Beta} \Sigma(\sigma\alpha.(X), Z)$
$\Sigma(\sigma\alpha.(@(X, Y)), Z) \to_{App} @(\Sigma(\sigma\alpha.(X), Z), \Sigma(\sigma\alpha.(Y), Z))$
$\Sigma(\sigma\alpha.(\lambda\beta.(X)), Z) \to_{Lambda} \lambda\beta.(\Sigma(\sigma\alpha.(X), Z))$
$\Sigma(\sigma\alpha.(\alpha), Z) \to_{Var1} Z$
$\Sigma(\sigma\alpha.(\hat{\beta}), Z) \to_{Var2} \hat{\beta}$

Note that our formalism allows us to specify the Var2 rule as originally
done in [19], while formalisms such as CRS force one to change this rule to a
stronger one, called gc, written as $\Sigma(\sigma\alpha.(X), Z) \to_{gc} X$, where the admissibility
condition on valuations guarantees that if $X/t$ is part of the valuation $\theta$, then
$\theta(\alpha)$ cannot be in $FV(t)$.

Example 3. The $\lambda\Delta$-calculus [18] is defined by the following SERS rewrite rules:

$@(\lambda\alpha.(X), Z) \to_{Beta} X[\alpha \leftarrow Z]$
$@(\Delta\alpha.(X), Z) \to_{\Delta 1} \Delta\beta.(X[\alpha \leftarrow \lambda\gamma.(@(\beta, @(\gamma, Z)))])$
$\Delta\alpha.(@(\alpha, X)) \to_{\Delta 2} X$
$\Delta\alpha.(@(\alpha, (\Delta\beta.(@(\alpha, X))))) \to_{\Delta 3} X$



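Definition 10 (Valuation). A variable assignment is a (partial) function $\theta_v$
from o-metavariables to variables with finite domain, such that for every pair of
o-metavariables $\alpha, \hat{\beta}$ we have $\theta_v\alpha \neq \theta_v\hat{\beta}$ (pre-bound and pre-free o-metavariables
are assigned different values).

A valuation $\theta$ is a pair of (partial) functions $(\theta_v, \theta_t)$ where $\theta_v$ is a variable
assignment and $\theta_t$ maps t-metavariables to terms. We write $Dom(\theta)$ for
$Dom(\theta_v) \cup Dom(\theta_t)$^. A valuation $\theta$ may be extended in a unique way to the set
of pre-metaterms $M$ such that $MVar(M) \subseteq Dom(\theta)$ as follows:

$\bar{\theta}\alpha = \theta_v\alpha$
$\bar{\theta}\hat{\alpha} = \theta_v\hat{\alpha}$
$\bar{\theta}X = \theta_t X$
$\bar{\theta}(f(M_1, \ldots, M_n)) = f(\bar{\theta}M_1, \ldots, \bar{\theta}M_n)$
$\bar{\theta}(\xi\alpha.(M_1, \ldots, M_n)) = \xi\theta_v\alpha.(\bar{\theta}M_1, \ldots, \bar{\theta}M_n)$
$\bar{\theta}(M_1[\alpha \leftarrow M_2]) = \bar{\theta}(M_1)\{\theta_v\alpha \leftarrow \bar{\theta}M_2\}$

We shall not distinguish between $\theta$ and $\bar{\theta}$ if no ambiguities arise. Also, we
sometimes write $\theta(M)$ thereby implicitly assuming that $MVar(M) \subseteq Dom(\theta)$.

Returning to the intuition behind $v$-equivalence, the idea is that it can be
translated into $\alpha$-conversion in the sense that $M =_v N$ implies $\theta M =_{\alpha} \theta N$
for any valuation $\theta$ such that $\theta M$ and $\theta N$ are defined. Indeed, coming back
to Example 1 and taking $\theta = \{\alpha/x, \beta/y, X/x\}$, we have
$\theta\lambda\alpha.(\alpha) = \lambda x.(x) =_{\alpha} \lambda y.(y) = \theta\lambda\beta.(\beta)$,
$\theta\lambda\alpha.(f) = \lambda x.(f) =_{\alpha} \lambda y.(f) = \theta\lambda\beta.(f)$,
$\theta\lambda\alpha.(X) = \lambda x.(x) \neq_{\alpha} \lambda y.(x) = \theta\lambda\beta.(X)$,
$\theta\lambda\beta.(\lambda\alpha.(X)) = \lambda y.(\lambda x.(x)) \neq_{\alpha} \lambda x.(\lambda y.(x)) = \theta\lambda\alpha.(\lambda\beta.(X))$.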

Definition 11 (Safe Valuations). Let $M \in \mathcal{PMT}$ and $\theta$ a valuation with
$MVar(M) \subseteq Dom(\theta)$. We say that $\theta$ is safe for $M$ if $\theta M$ is defined. Likewise, if
$(G, D)$ is a rewrite rule, we say that $\theta$ is safe for $(G, D)$ if $\theta D$ is defined.

Note that if the notion of substitution we are dealing with were not restricted
then $\alpha$-conversion could be required in order to apply a valuation to a
pre-metaterm. Also, for any valuation $\theta$ and pre-metaterm $M$ with $MVar(M) \subseteq
Dom(\theta)$ that contains no occurrences of the metasubstitution operator, $\theta$ is safe
for $M$. Thus, we only ask $\theta$ to be safe for $D$ (not $G$) in the previous definition.

The following condition is the classical notion of admissibility used in higher- 
order rewriting [21] to avoid inconsistencies in rewrite steps. 

Definition 12 (Path Condition for T-Metavariables). Let $X$ be a t-metavariable. Consider all the occurrences $p_1, \ldots, p_n$ of $X$ in $(G, D)$, and their respective parameter paths $l_1, \ldots, l_n$ in the trees corresponding to $G$ and $D$. A valuation $\theta$ verifies the path condition for $X$ in $(G, D)$ if for every $x \in FV(\theta X)$, either ($\forall\, 1 \leq i \leq n$ we have $x \in \theta l_i$) or ($\forall\, 1 \leq i \leq n$ we have $x \notin \theta l_i$).

† As usual, $Dom(\varphi)$ denotes the domain of the partial function $\varphi$.
This definition may be read as: one occurrence of $x \in FV(\theta X)$ with $X$ in $(G, D)$ is in the scope of some binding occurrence of $x$ iff every occurrence of $X$ in $(G, D)$ is in the scope of a bound o-metavariable $\alpha$ with $\theta\alpha = x$. For example, consider the SERS rule $\lambda\alpha.(\xi\beta.(X)) \to \xi\beta.(X)$ and the valuations $\theta_1 = \{\alpha/x, \beta/y, X/z\}$ and $\theta_2 = \{\alpha/x, \beta/y, X/x\}$. Then $\theta_1$ verifies the path condition for $X$, but $\theta_2$ does not, since when instantiating the rewrite rule with $\theta_2$ the variable $x$ would occur both bound (on the LHS) and free (on the RHS).

Definition 13 (Admissible Valuations). A valuation $\theta$ is said to be admissible for a rewrite rule $(G, D)$ iff

— $\theta$ is safe for $(G, D)$

— if $\alpha$ and $\beta$ occur in $(G, D)$ with $\alpha \neq \beta$ then $\theta_v\alpha \neq \theta_v\beta$

— $\theta$ verifies the path condition for every t-metavariable in $(G, D)$

Note that an admissible valuation is safe by definition, but a safe valuation may not be admissible: consider the rule $\lambda\alpha.app(X, \alpha) \to X$. The valuation $\theta = \{\alpha/x, X/x\}$ is trivially safe but is not admissible since the path condition is not verified: $x \in \theta(\alpha)$ but $x \notin \theta(\epsilon)$ ($x$ occurs bound on the LHS and free on the RHS).
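
The path condition can be checked mechanically once the free variables of $\theta X$ and the instantiated parameter paths $\theta l_i$ are known. The following Python sketch is only an illustration under assumed encodings: the function name `verifies_path_condition`, the representation of paths as tuples of o-metavariable names, and the dictionary `theta_v` are not part of the paper.

```python
def verifies_path_condition(free_vars_of_value, instantiated_paths):
    """Check Definition 12 for one t-metavariable X.

    free_vars_of_value: the set FV(theta X).
    instantiated_paths: one collection of variables (theta l_i) per
    occurrence of X in the rule (G, D).
    Each free variable must be on every path or on none of them.
    """
    for x in free_vars_of_value:
        hits = [x in path for path in instantiated_paths]
        if any(hits) and not all(hits):
            return False
    return True


# Rule  lambda alpha.app(X, alpha) -> X :
# X has parameter path (alpha) on the LHS and the empty path on the RHS.
theta_v = {"alpha": "x"}                      # hypothetical variable assignment
paths = [("alpha",), ()]
instantiated = [tuple(theta_v[a] for a in p) for p in paths]

not_admissible = verifies_path_condition({"x"}, instantiated)  # X/x
admissible = verifies_path_condition({"z"}, instantiated)      # X/z
```

With $X/x$ the variable $x$ lies on the LHS path but not on the RHS path, so the check fails, which matches the discussion above; with $X/z$ it succeeds.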

Now, there are two possible and equivalent ways to define reduction in a higher-order framework. One can either define reduction via a notion of substitution which makes explicit use of $\alpha$-conversion, as is usually done [10], or, as is done here, define reduction explicitly as reduction modulo $\alpha$-conversion, using a notion of restricted substitution which does not make use of $\alpha$-conversion. We choose this second (and more involved) approach since we prefer to have a notion of reduction on terms in both formalisms (with names and with de Bruijn indices) which is similar enough to make technical proofs work easily.

Definition 14 (Reduction on Terms). Let $\mathcal{R}$ be a set of SERS rewrite rules and $s$, $t$ terms. We say that $s$ $\mathcal{R}$-reduces to $t$, written $s \to_{\mathcal{R}} t$, iff there exists a rewrite rule $(G, D) \in \mathcal{R}$, an admissible valuation $\theta$ for $(G, D)$ and a context $C$ such that $s =_\alpha C[\theta G]$ and $t =_\alpha C[\theta D]$.

3 Simplified Expression Reduction Systems with Indices 

We now introduce SERSdb, a higher-order rewrite formalism based on de Bruijn indices.

3.1 De Bruijn Metaterms and Terms 

A classical way to avoid $\alpha$-conversion is to use de Bruijn index notation [7], where names of variables are replaced by natural numbers. When talking about a set $N$ of de Bruijn indices we may refer to $Names(N)$ as the set of names of $N$ given by the order on the set of variables $\mathcal{V}$ introduced in Section 2. Indeed, if $N = \{n_1, \ldots, n_m\}$, then $Names(N) = \{x_{n_1}, \ldots, x_{n_m}\}$.

In the sequel, in order to distinguish a concept defined for the SERS formalism from its corresponding version (if it exists) in the SERSdb formalism we may prefix it using the qualifying term "de Bruijn", e.g. "de Bruijn metaterms".

Definition 15 (de Bruijn Signature). Consider the denumerable and disjoint infinite sets:

— $\{\alpha_1, \alpha_2, \alpha_3, \ldots\}$ a set of symbols called binder indicators, denoted $\alpha, \beta, \ldots$,

— $\{\hat\alpha_1, \hat\alpha_2, \ldots\}$ a set of i-metavariables (i for index), denoted $\hat\alpha, \hat\beta, \ldots$,

— $\{X_l^1, X_l^2, X_l^3, \ldots\}$ a set of t-metavariables (t for term), where $l$ ranges over the set of labels built over binder indicators, denoted $X_l, Y_l, Z_l, \ldots$,

— $\mathcal{F} = \{f_1, f_2, f_3, \ldots\}$ a set of function symbols equipped with a fixed (possibly zero) arity, denoted $f, g, h, \ldots$,

— $\mathcal{B} = \{\lambda_1, \lambda_2, \lambda_3, \ldots\}$ a set of binder symbols equipped with a fixed (non-zero) arity, denoted $\lambda, \mu, \nu, \xi, \ldots$

We remark that the set of binder indicators is exactly the set of pre-bound o-metavariables introduced in Definition 1. The reason for using the same alphabet in both formalisms shall become clear in Section 4, but intuitively, we need a mechanism to annotate binding paths in the de Bruijn setting to distinguish metaterms like $\xi\beta.(\xi\alpha.(X))$ and $\xi\alpha.(\xi\beta.(X))$ appearing in the same rule when translated into an SERSdb system.

Definition 16 (de Bruijn Pre-metaterms). The set of de Bruijn pre-metaterms, denoted $\mathcal{PMT}_{db}$, is defined by the following two-sorted grammar:

\[
\begin{array}{llcl}
\textrm{metaindices} & I & ::= & 1 \mid S(I) \mid \hat\alpha \\
\textrm{pre-metaterms} & A & ::= & I \mid X_l \mid f(A, \ldots, A) \mid \xi(A, \ldots, A) \mid A[A]
\end{array}
\]

The symbol $.[.]$ in a pre-metaterm is called the de Bruijn metasubstitution operator. The binder symbols together with the de Bruijn metasubstitution operator are called binder operators, and the same remark of Definition 3 applies.

We shall use $A, B, A_i, \ldots$ to denote de Bruijn pre-metaterms and the convention that $S^0(1) = 1$, $S^0(\hat\alpha) = \hat\alpha$ and $S^{j+1}(n) = S(S^j(n))$. As usually done for indices, we shall abbreviate $S^{j-1}(1)$ as $j$.

Even if the formal mechanism used to translate pre-metaterms with names into pre-metaterms with de Bruijn indices will be given in Section 4, let us introduce intuitively some ideas in order to justify the syntax used for i-metavariables. In the formalism SERS there is a clear distinction between free and bound o-metavariables. This fact must also be reflected in the formalism SERSdb, where bound o-metavariables are represented with indices and free o-metavariables are represented with i-metavariables (this distinction between free and bound variables is also used in some formalizations of $\lambda$-calculus [17]). However, free variables in SERSdb always appear in a binding context, so that a de Bruijn valuation of such variables has to reflect the adjustment needed to represent the same variables in a different context. This can be done by surrounding the i-metavariable by as many operators $S$ as necessary. As an example consider the pre-metaterm $\xi\alpha.(\hat\beta)$. If we translate it to $\xi(\hat\beta)$, then a de Bruijn valuation like $\kappa = \{\hat\beta/1\}$ binds the variable, whereas this is impossible in the name formalism thanks to the conditions imposed on a name valuation (cf. the condition on variable assignments in Definition 10). Our solution is then to translate the pre-metaterm $\xi\alpha.(\hat\beta)$ by $\xi(S(\hat\beta))$, in such a way that there is no capture of variables, since $\kappa(\xi(S(\hat\beta)))$ is exactly $\xi(2)$. The solution adopted here is in some sense what is called pre-cooking in [9].
We use $MVar(A)$ (resp. $MVar_i(A)$ and $MVar_t(A)$) to denote the set of all metavariables (resp. i- and t-metavariables) of the de Bruijn pre-metaterm $A$.

As in the SERS formalism, we also need here a notion of well-formed pre-metaterm. The first motivation is to guarantee that labels of t-metavariables are correct w.r.t. the context in which they appear; the second one is to ensure that indices like $S^i(1)$ (resp. $S^i(\hat\alpha)$) correspond to bound (resp. free) variables. Indeed, the pre-metaterms $\xi(\xi(4))$ and $\xi(\hat\alpha)$ shall not make sense for us, and hence shall not be considered well-formed.

Definition 17 (de Bruijn Metaterms). A pre-metaterm $A \in \mathcal{PMT}_{db}$ is said to be a metaterm iff the predicate $\mathcal{WF}(A)$ holds, where $\mathcal{WF}(A)$ iff $\mathcal{WF}_\epsilon(A)$, and $\mathcal{WF}_l(A)$ is defined as follows:

— $\mathcal{WF}_l(S^j(1))$ iff $j + 1 \leq |l|$

— $\mathcal{WF}_l(S^j(\hat\alpha))$ iff $j = |l|$

— $\mathcal{WF}_l(X_k)$ iff $l = k$ and $l$ is a simple label

— $\mathcal{WF}_l(f(A_1, \ldots, A_n))$ iff for all $1 \leq i \leq n$ we have $\mathcal{WF}_l(A_i)$

— $\mathcal{WF}_l(\xi(A_1, \ldots, A_n))$ iff there exists $\alpha \notin l$ such that for all $1 \leq i \leq n$ we have $\mathcal{WF}_{\alpha l}(A_i)$

— $\mathcal{WF}_l(A_1[A_2])$ iff $\mathcal{WF}_l(A_2)$ and there exists $\alpha \notin l$ such that $\mathcal{WF}_{\alpha l}(A_1)$

Therefore indices of the form $S^j(1)$ may only occur in metaterms if they represent bound variables, and well-formed metaindices of the form $S^j(\hat\alpha)$ always represent a free variable. Note that when considering $\mathcal{WF}(M)$ and $\mathcal{WF}(A)$ it is Definitions 4 and 17 which are referenced, respectively.

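Example 4. Pre-metaterms $\xi(X_\alpha, \lambda(Y_{\beta\alpha}, S(1)))$, $f(\hat\beta, \lambda(Y_\alpha, S(\hat\alpha)))$, $g(\lambda(\xi(h)))$ are metaterms, while $f(S(\hat\alpha), \xi(X_\beta))$, $\lambda(\xi(X_{\alpha\alpha}))$, $f(\hat\beta, \lambda(\xi(S(\hat\beta))))$ are not.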



Definition 18 (de Bruijn Terms and de Bruijn Contexts). The set of de Bruijn terms, denoted $\mathcal{T}_{db}$, and the set of de Bruijn contexts are defined by:

\[
\begin{array}{llcl}
\textrm{de Bruijn indices} & n & ::= & 1 \mid S(n) \\
\textrm{de Bruijn terms} & a & ::= & n \mid f(a, \ldots, a) \mid \xi(a, \ldots, a) \\
\textrm{de Bruijn contexts} & E & ::= & \Box \mid f(a, \ldots, E, \ldots, a) \mid \xi(a, \ldots, E, \ldots, a)
\end{array}
\]

We use $a, b, a_i, b_i, \ldots$ for de Bruijn terms and $E, F, \ldots$ for de Bruijn contexts. We may refer to the binder path number of a context, which is the number of binders between the $\Box$ and the root.

We use $FV(a)$ to denote the set of free variables (indices) in $a$; the result of substituting a term $b$ for the index $n \geq 1$ in a term $a$ is denoted $a\{\!\{n \leftarrow b\}\!\}$; the updating functions are denoted $U_i^n(.)$ for $i \geq 0$ and $n \geq 1$. All these concepts are defined as usual.
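
Since updating and substitution are only said to be "defined as usual", the following Python sketch spells out one common formulation (the one used in $\lambda s$-style calculi), as a reading aid rather than as the authors' definition. The term encoding (an `int` for an index, `("fun", symbol, args)` for function symbols, `("bind", symbol, args)` for binder symbols) and the names `lift` and `subst` are assumptions made for this illustration.

```python
def lift(a, n, i=0):
    """Updating function U_i^n(a): add n-1 to every index above the cutoff i."""
    if isinstance(a, int):
        return a + n - 1 if a > i else a
    kind, sym, args = a
    j = i + 1 if kind == "bind" else i        # one more binder is crossed
    return (kind, sym, tuple(lift(x, n, j) for x in args))


def subst(a, n, b):
    """Substitution a{{n <- b}} of the term b for the index n in a."""
    if isinstance(a, int):
        if a > n:
            return a - 1                      # one binder has disappeared
        if a == n:
            return lift(b, n)                 # adjust b to the n-1 binders crossed
        return a                              # bound below n: untouched
    kind, sym, args = a
    m = n + 1 if kind == "bind" else n
    return (kind, sym, tuple(subst(x, m, b) for x in args))


body = ("fun", "app", (1, 3))
contracted = subst(body, 1, 2)                # body of a beta-redex, argument 2
under_binder = subst(("bind", "lam", (("fun", "app", (1, 2)),)), 1, 5)
```

Here `contracted` is `app(2, 2)`: the index 1 is replaced by the argument and the free index 3 is decremented. In `under_binder` the substituted index 5 becomes 6 because it has crossed one binder.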



Definition 19 (Free de Bruijn Metavariables). Let $A$ be a de Bruijn pre-metaterm. The set of free metavariables of $A$, $FMVar(A)$, is defined as:

\[
\begin{array}{lcl}
FMVar(S^j(1)) & \stackrel{def}{=} & \emptyset \\
FMVar(S^j(\hat\alpha)) & \stackrel{def}{=} & \{\hat\alpha\} \\
FMVar(X_l) & \stackrel{def}{=} & \{X_l\} \\
FMVar(f(A_1, \ldots, A_n)) & \stackrel{def}{=} & \bigcup_{i=1}^{n} FMVar(A_i) \\
FMVar(\xi(A_1, \ldots, A_n)) & \stackrel{def}{=} & \bigcup_{i=1}^{n} FMVar(A_i) \\
FMVar(A_1[A_2]) & \stackrel{def}{=} & FMVar(A_1) \cup FMVar(A_2)
\end{array}
\]

Note that this definition also applies to de Bruijn metaterms. The set of names of free metavariables of $A$ is the set of free metavariables of $A$ where each $X_l$ is replaced simply by $X$. This notion will be used in Definition 20.

3.2 Reduction 

We define rewrite rules, valuations, their validity, and reduction in SERSdb.

Definition 20 (de Bruijn Rewrite Rule). A de Bruijn rewrite rule is a pair 
of de Bruijn metaterms {L,R) (also written L — > R) such that 

— the first symbol in L is a function symbol or a binder symbol 

— the set of names of FMVar{R) is included in the set of names of FMVar{L) 

— the metasubstitution operator does not occur in L 



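Definition 21 (de Bruijn Valuation). A de Bruijn valuation $\kappa$ is a pair of (partial) functions $(\kappa_i, \kappa_t)$ where $\kappa_i$ is a function from i-metavariables to integers, and $\kappa_t$ is a function from t-metavariables to de Bruijn terms. We denote by $Dom(\kappa)$ the set $Dom(\kappa_i) \cup Dom(\kappa_t)$. A valuation $\kappa$ determines in a unique way a function $\bar\kappa$ from the set of pre-metaterms $A$ with $FMVar(A) \subseteq Dom(\kappa)$ to the set of terms as follows:

\[
\begin{array}{lcl}
\bar\kappa 1 & \stackrel{def}{=} & 1 \\
\bar\kappa S(I) & \stackrel{def}{=} & S(\bar\kappa I) \\
\bar\kappa\hat\alpha & \stackrel{def}{=} & \kappa_i\hat\alpha \\
\bar\kappa X_l & \stackrel{def}{=} & \kappa_t X_l \\
\bar\kappa f(A_1, \ldots, A_n) & \stackrel{def}{=} & f(\bar\kappa A_1, \ldots, \bar\kappa A_n) \\
\bar\kappa \xi(A_1, \ldots, A_n) & \stackrel{def}{=} & \xi(\bar\kappa A_1, \ldots, \bar\kappa A_n) \\
\bar\kappa(A_1[A_2]) & \stackrel{def}{=} & \bar\kappa(A_1)\{\!\{1 \leftarrow \bar\kappa A_2\}\!\}
\end{array}
\]

Note that in the above definition the substitution operator $\{\!\{\cdot \leftarrow \cdot\}\!\}$ refers to the usual substitution defined on terms with de Bruijn indices.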

We now introduce the notion of value function which is used to give seman- 
tics to metavariables with labels in the SERSdb formalism. The goal pursued by 
the labels of metavariables is that of incorporating “context” information as a 
defining part of a metavariable. As a consequence, we must verify that the terms 
substituted for every occurrence of a fixed metavariable coincide “modulo” their 
corresponding context. Dealing with such notion of “coherence” of substitutions 
in a de Bruijn formalism is also present in other formalisms but in a more re- 
stricted form. Thus for example, as mentioned before, a pre-cooking function 
is used in [9] in order to avoid variable capture in the higher-order unification 
procedure. In XRS [16] the notions of binding arity and pseudo-binding arity 
are introduced in order to take into account the parameter path of the different 
occurrences of t-metavariables appearing in a rewrite rule. Our notion of “co- 
herence” is implemented with valid valuations (cf. Definition 23) and it turns 
out to be more general than the solutions proposed in [9] and [16]. 



Definition 22 (Value Function). Let $a \in \mathcal{T}_{db}$ and $l$ be a label of binder indicators. Then we define the value function $Value(l, a)$ as $Value^0(l, a)$ where

\[
\begin{array}{lcl}
Value^i(l, n) & \stackrel{def}{=} &
\left\{
\begin{array}{ll}
n & \textrm{if } n \leq i \\
at(l, n - i) & \textrm{if } 0 < n - i \leq |l| \\
x_{n - i - |l|} & \textrm{if } n - i > |l|
\end{array}
\right. \\[2ex]
Value^i(l, f(a_1, \ldots, a_n)) & \stackrel{def}{=} & f(Value^i(l, a_1), \ldots, Value^i(l, a_n)) \\
Value^i(l, \xi(a_1, \ldots, a_n)) & \stackrel{def}{=} & \xi(Value^{i+1}(l, a_1), \ldots, Value^{i+1}(l, a_n))
\end{array}
\]

It is worth noting that $Value^i(l, n)$ may give three different kinds of results. This is just a technical trick to make later proofs easier. Indeed, we have for example $Value(\alpha\beta, \xi(f(3, 1))) = \xi(f(\beta, 1)) = Value(\beta\alpha, \xi(f(2, 1)))$ and $Value(\epsilon, f(\xi(1), \lambda(2))) = f(\xi(1), \lambda(x_1)) \neq f(\xi(1), \lambda(\alpha)) = Value(\alpha, f(\xi(1), \lambda(2)))$. Thus the function $Value(l, a)$ interprets the de Bruijn term $a$ in an $l$-context: bound indices are left untouched, free indices referring to the $l$-context are replaced by the corresponding binder indicator and the remaining free indices are replaced by their corresponding variable names.
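
The three-way case analysis of the value function can be sketched in Python as follows. The encoding is an assumption made for this illustration only: indices are `int`s, a label is a tuple of binder-indicator names (leftmost first, so that $at(l, 1)$ is `label[0]`), variable names $x_k$ are the strings `"xk"`, and terms are `("fun", symbol, args)` or `("bind", symbol, args)`.

```python
def value(label, a, i=0):
    """Value^i(label, a): interpret the de Bruijn term a in a label-context."""
    if isinstance(a, int):
        if a <= i:
            return a                            # bound inside a: left untouched
        if a - i <= len(label):
            return label[a - i - 1]             # at(l, n - i): binder indicator
        return f"x{a - i - len(label)}"         # remaining free index: its name
    kind, sym, args = a
    j = i + 1 if kind == "bind" else i
    return (kind, sym, tuple(value(label, x, j) for x in args))


def xi(*args):
    return ("bind", "xi", args)


def lam(*args):
    return ("bind", "lam", args)


def f(*args):
    return ("fun", "f", args)


left = value(("alpha", "beta"), xi(f(3, 1)))
right = value(("beta", "alpha"), xi(f(2, 1)))
in_empty_context = value((), f(xi(1), lam(2)))
in_alpha_context = value(("alpha",), f(xi(1), lam(2)))
```

The first two results coincide (both are $\xi(f(\beta, 1))$), while the last two differ ($\lambda(x_1)$ versus $\lambda(\alpha)$), as in the examples given in the text.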

In order to introduce the notion of valid de Bruijn valuations let us consider the following rule:

\[ \xi\alpha.(\xi\beta.(X)) \to_r \xi\beta.(\xi\alpha.(X)) \]

Even if translation of rewrite rules into de Bruijn rewrite rules has not been defined yet (Section 4), one may guess that a reasonable translation would be the following rule (called $r_{DB}$):

\[ \xi(\xi(X_{\beta\alpha})) \to_{r_{DB}} \xi(\xi(X_{\alpha\beta})) \]

which indicates that $\beta$ (resp. $\alpha$) is the first bound occurrence in the LHS (resp. RHS) while $\alpha$ (resp. $\beta$) is the second bound occurrence in the LHS (resp. RHS). Now, if $X$ is instantiated by $x$, $\alpha$ by $x$ and $\beta$ by $y$ in the SERS system, then we have an $r$-reduction step $\xi x.(\xi y.(x)) \to \xi y.(\xi x.(x))$. However, to reflect this fact in the corresponding SERSdb system we need to instantiate $X_{\beta\alpha}$ by 2 and $X_{\alpha\beta}$ by 1, thus obtaining an $r_{DB}$-reduction step $\xi(\xi(2)) \to \xi(\xi(1))$. This clearly shows that de Bruijn t-metavariables having the same name but different labels cannot be instantiated arbitrarily, as they have to reflect the renaming of variables which is indicated by their labels. This is exactly the role of the property of validity:

Definition 23 (Valid de Bruijn Valuation). A de Bruijn valuation $\kappa$ is said to be valid if for every pair of t-metavariables $X_l$ and $X_{l'}$ in $Dom(\kappa)$ we have $Value(l, \kappa X_l) = Value(l', \kappa X_{l'})$. Likewise, we say that a de Bruijn valuation $\kappa$ is valid for a rewrite rule $(L, R)$ if for every pair of t-metavariables $X_l$ and $X_{l'}$ in $(L, R)$ we have $Value(l, \kappa X_l) = Value(l', \kappa X_{l'})$.

It is interesting to note that there is no concept analogous to safeness (cf. Definition 11) as used for named SERS, due to the use of de Bruijn indices. Also, the last condition in the definition of an admissible valuation (cf. Definition 13) is subsumed by the above Definition 23 in the setting of SERSdb.

Example 5. Returning to the example just after Definition 22 we have that $\kappa = \{X_{\beta\alpha}/2, X_{\alpha\beta}/1\}$ is valid since $Value(\beta\alpha, 2) = \alpha = Value(\alpha\beta, 1)$.
Another interesting example is the well-known $\eta$-contraction rule $\lambda x.@(X, x) \to X$ if $x \notin FV(X)$. It can be expressed in the SERS formalism as the rule $(\eta_n)$ $\lambda\alpha.@(X, \alpha) \to X$, and in the SERSdb formalism as the rule $(\eta_{db})$ $\lambda(@(X_\alpha, 1)) \to X_\epsilon$.

Remark that this kind of rule cannot be expressed in the XRS formalism [16] since it does not verify the binding arity condition. Our formalism allows us to write rules like $\eta_{db}$ because valid valuations will test for coherence of values. Indeed, an admissible valuation for $\eta_n$ is a valuation $\theta$ such that $\theta X$ does not contain a free occurrence of $\theta(\alpha)$. This is exactly the condition used in any usual formalization of the $\eta$-rule.

Definition 24 (Reduction on de Bruijn Terms). Let $\mathcal{R}$ be a set of de Bruijn rules and $a$, $b$ de Bruijn terms. We say that $a$ $\mathcal{R}$-reduces to $b$, written $a \to_{\mathcal{R}} b$, iff there is a de Bruijn rule $(L, R) \in \mathcal{R}$ and a de Bruijn valuation $\kappa$ valid for $(L, R)$ such that $a = E[\kappa L]$ and $b = E[\kappa R]$, where $E$ is a de Bruijn context.

Thus, the term $\lambda(app(\lambda(app(1, 3)), 1))$ rewrites by the $\eta_{db}$ rule to $\lambda(app(1, 2))$, using the (valid) valuation $\kappa = \{X_\alpha/\lambda(app(1, 3)), X_\epsilon/\lambda(app(1, 2))\}$.
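
Validity (Definition 23) amounts to comparing the values of all instances of a t-metavariable name. The sketch below checks this for the $\eta_{db}$ valuation just given and for a valuation that would capture the bound variable. The encoding is hypothetical and chosen for this illustration: a valuation is a dictionary keyed by `(name, label)`, labels are tuples of binder-indicator names, and the helper `value` is a compact restatement of the value function of Definition 22.

```python
def value(label, a, i=0):
    """Value^i(label, a) as in Definition 22 (assumed encoding, see lead-in)."""
    if isinstance(a, int):
        if a <= i:
            return a
        if a - i <= len(label):
            return label[a - i - 1]
        return f"x{a - i - len(label)}"
    kind, sym, args = a
    j = i + 1 if kind == "bind" else i
    return (kind, sym, tuple(value(label, x, j) for x in args))


def is_valid(kappa_t):
    """All instances X_l of the same name X must have the same value."""
    seen = {}
    for (name, label), term in kappa_t.items():
        v = value(label, term)
        if seen.setdefault(name, v) != v:
            return False
    return True


def lam(body):
    return ("bind", "lam", (body,))


def app(a, b):
    return ("fun", "app", (a, b))


# Valuation used for the eta_db step in the text.
eta_ok = {("X", ("alpha",)): lam(app(1, 3)), ("X", ()): lam(app(1, 2))}
# Here X would denote the variable bound by the outer lambda of the LHS.
eta_capture = {("X", ("alpha",)): 1, ("X", ()): 1}
# Example 5.
example_5 = {("X", ("beta", "alpha")): 2, ("X", ("alpha", "beta")): 1}

results = (is_valid(eta_ok), is_valid(eta_capture), is_valid(example_5))
```

The second valuation is rejected because $Value(\alpha, 1) = \alpha$ while $Value(\epsilon, 1) = x_1$; this is how validity enforces the side condition $x \notin FV(X)$ of the $\eta$-rule.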

4 Relating SERS and SERSdb 

In this section we show how reduction in the SERS formalism may be simulated 
in the SERSdb formalism and vice-versa. 

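Definition 25 (From Terms (and Contexts) to de Bruijn Terms (and Contexts)). The translation of a term $t$, denoted $T(t)$, is defined as $T_\epsilon(t)$ where

\[
\begin{array}{lcl}
T_k(x) & \stackrel{def}{=} &
\left\{
\begin{array}{ll}
pos(x, k) & \textrm{if } x \in k \\
O(x) + |k| & \textrm{if } x \notin k
\end{array}
\right. \\[2ex]
T_k(f(t_1, \ldots, t_n)) & \stackrel{def}{=} & f(T_k(t_1), \ldots, T_k(t_n)) \\
T_k(\xi x.(t_1, \ldots, t_n)) & \stackrel{def}{=} & \xi(T_{xk}(t_1), \ldots, T_{xk}(t_n))
\end{array}
\]

The translation of a context, denoted $T(C)$, adds the clause $T_k(\Box) = \Box$.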
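
As a reading aid, the translation $T$ on terms can be sketched in a few lines of Python. The encoding is assumed for this illustration: variables are the strings `"x1"`, `"x2"`, ... (so that $O(x_n) = n$), named terms are `("fun", f, args)` or `("bind", binder, x, args)`, and the label $k$ is a tuple with the most recently bound variable first.

```python
def order(x):
    """O(x_n) = n, assuming variables are named 'x1', 'x2', ..."""
    return int(x[1:])


def to_db(t, k=()):
    """T_k(t): replace names by de Bruijn indices."""
    if isinstance(t, str):
        if t in k:
            return k.index(t) + 1               # pos(x, k)
        return order(t) + len(k)                # O(x) + |k|
    if t[0] == "fun":
        _, f, args = t
        return ("fun", f, tuple(to_db(u, k) for u in args))
    _, binder, x, args = t
    return ("bind", binder, tuple(to_db(u, (x,) + k) for u in args))


# xi x1.(xi x2.(x1)) and xi x2.(xi x1.(x1))
lhs = to_db(("bind", "xi", "x1", (("bind", "xi", "x2", ("x1",)),)))
rhs = to_db(("bind", "xi", "x2", (("bind", "xi", "x1", ("x1",)),)))
# lambda x1.app(x1, x3): the free variable x3 is shifted by one binder
open_term = to_db(("bind", "lam", "x1", (("fun", "app", ("x1", "x3")),)))
```

The two closed terms translate to $\xi(\xi(2))$ and $\xi(\xi(1))$ respectively, which are the two sides of the $r_{DB}$-step discussed after Definition 22.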

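Definition 26 (From Pre-metaterms to de Bruijn Pre-metaterms). The translation of a pre-metaterm $M$, denoted $T(M)$, is defined as $T_\epsilon(M)$ where:

\[
\begin{array}{lcl}
T_k(\alpha) & \stackrel{def}{=} & pos(\alpha, k), \textrm{ if } \alpha \in k \\
T_k(\hat\alpha) & \stackrel{def}{=} & S^{|k|}(\hat\alpha) \\
T_k(X) & \stackrel{def}{=} & X_k \\
T_k(f(M_1, \ldots, M_n)) & \stackrel{def}{=} & f(T_k(M_1), \ldots, T_k(M_n)) \\
T_k(\xi\alpha.(M_1, \ldots, M_n)) & \stackrel{def}{=} & \xi(T_{\alpha k}(M_1), \ldots, T_{\alpha k}(M_n)) \\
T_k(M_1[\alpha \leftarrow M_2]) & \stackrel{def}{=} & T_{\alpha k}(M_1)[T_k(M_2)]
\end{array}
\]

Note that if $M$ is a metaterm, then $T(M)$ will be a de Bruijn metaterm and only have t-metavariables with simple labels. Note also that, for some pre-metaterms, such as $\xi\alpha.(\beta)$, the translation $T(.)$ is not defined.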

Lemma 1 (T Preserves Well-Formedness). If M is a metaterm, then T{M) 
is a de Bruijn metaterm. 
Definition 27 (From SERS Rewrite Rules to SERSdb Rewrite Rules). Let $(G, D)$ be a rewrite rule in the SERS formalism. Then $T(G, D)$ denotes the translation of the rewrite rule, defined as $(T(G), T(D))$.

As an immediate consequence of Lemma 1 and Definition 27, if $(G, D)$ is an SERS rewrite rule, then $T(G, D)$ is an SERSdb rewrite rule.

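Example 6. Following Example 2, the specification of $\lambda x$ in the SERSdb formalism is given below.

\[
\begin{array}{lcl}
@(\lambda(X_\alpha), Z_\epsilon) & \to & \Sigma(\sigma(X_\alpha), Z_\epsilon) \\
\Sigma(\sigma(@(X_\alpha, Y_\alpha)), Z_\epsilon) & \to & @(\Sigma(\sigma(X_\alpha), Z_\epsilon), \Sigma(\sigma(Y_\alpha), Z_\epsilon)) \\
\Sigma(\sigma(\lambda(X_{\beta\alpha})), Z_\epsilon) & \to & \lambda(\Sigma(\sigma(X_{\alpha\beta}), Z_\beta)) \\
\Sigma(\sigma(1), Z_\epsilon) & \to & Z_\epsilon \\
\Sigma(\sigma(S(\hat\beta)), Z_\epsilon) & \to & \hat\beta
\end{array}
\]

The rule $\Sigma(\sigma(\lambda(X_{\beta\alpha})), Z_\epsilon) \to \lambda(\Sigma(\sigma(X_{\alpha\beta}), Z_\beta))$ is interesting since it illustrates the use of binder commutation from $X_{\beta\alpha}$ to $X_{\alpha\beta}$ and shows how some index adjustment shall be necessary when going from $Z_\epsilon$ to $Z_\beta$.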

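Example 7. The translation of the $\lambda\Delta$-calculus (Example 3) yields the following rewrite rules in the SERSdb formalism:

\[
\begin{array}{lcl}
@(\lambda(X_\alpha), Z_\epsilon) & \to & X_\alpha[Z_\epsilon] \\
@(\Delta(X_\alpha), Z_\epsilon) & \to & \Delta(X_\alpha[\lambda(@(S(1), @(1, Z_{\gamma\beta})))]) \\
\Delta(@(1, X_\alpha)) & \to & X_\epsilon \\
\Delta(@(1, (\Delta(@(S(1), X_{\beta\alpha}))))) & \to & X_\epsilon
\end{array}
\]

We remark that the translation of $\Delta_1$, $\Delta_2$ and $\Delta_3$ would not be possible in XRS [16].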

Proposition 1 (Simulating SERS Reduction via SERSdb Reduction). 

Suppose s — > t in the SERS formalism using the rewrite rule (G,D). Then we 
have T{s ) — > T{t) in the SERSdb formalism using the rule T{G,D). 

We now consider how reduction in SERSdb may be simulated in SERS . 

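Definition 28 (From de Bruijn Terms (Contexts) to Terms (Contexts)). We define the translation of $a \in \mathcal{T}_{db}$, denoted $U(a)$, as $U_\epsilon^{Names(FV(a))}(a)$ where, for every finite set of variables $S$ and label of variables $k$, $U_k^S(a)$ is defined by:

\[
\begin{array}{lcl}
U_k^S(n) & \stackrel{def}{=} &
\left\{
\begin{array}{ll}
at(k, n) & \textrm{if } n \leq |k| \\
x_{n - |k|} & \textrm{if } n > |k| \textrm{ and } x_{n - |k|} \in S
\end{array}
\right. \\[2ex]
U_k^S(f(a_1, \ldots, a_n)) & \stackrel{def}{=} & f(U_k^S(a_1), \ldots, U_k^S(a_n)) \\
U_k^S(\xi(a_1, \ldots, a_n)) & \stackrel{def}{=} & \xi x.(U_{xk}^S(a_1), \ldots, U_{xk}^S(a_n)) \quad \textrm{for any } x \notin k \cup S
\end{array}
\]

The translation of a de Bruijn context $E$, denoted $U(E)$, is defined as above but adding the clause $U_k^S(\Box) \stackrel{def}{=} \Box$. We remark that we can always choose $x \notin k \cup S$ since both $k$ and $S$ are finite.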



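(Right-hand sides of the last three rules of Example 6, displaced here by the page layout: $\lambda(\Sigma(\sigma(X_{\alpha\beta}), Z_\beta))$, $Z_\epsilon$ and $\hat\beta$.)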
Note that $U(.)$ is not a function in the sense that the choice of bound variables is non-deterministic. However, if $t$ and $t'$ both belong to $U(a)$, then $t =_\alpha t'$. Thus, $U(.)$ can be seen as a function from de Bruijn terms to $\alpha$-equivalence classes.
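
The translation $U$ of Definition 28 can be made deterministic by fixing a strategy for picking the fresh name at each binder. The Python sketch below always takes the variable $x_n$ with the smallest $n$ that is neither in $k$ nor in $S$; this strategy, the function names, and the term encoding (indices as `int`s, variables as strings `"xn"`, terms as `("fun", f, args)` and `("bind", binder, args)`, named binders as `("bind", binder, x, args)`) are assumptions made for this illustration. It returns one representative of the $\alpha$-equivalence class.

```python
def free_indices(a, depth=0):
    """Free indices of a, renumbered as seen from outside the term."""
    if isinstance(a, int):
        return {a - depth} if a > depth else set()
    kind, _sym, args = a
    d = depth + 1 if kind == "bind" else depth
    return set().union(*(free_indices(u, d) for u in args))


def to_named(a, k=(), S=None):
    """U_k^S(a), with S defaulting to the names of the free indices of a."""
    if S is None:
        S = {f"x{n}" for n in free_indices(a)}
    if isinstance(a, int):
        return k[a - 1] if a <= len(k) else f"x{a - len(k)}"
    kind, sym, args = a
    if kind == "fun":
        return ("fun", sym, tuple(to_named(u, k, S) for u in args))
    n = 1
    while f"x{n}" in S or f"x{n}" in k:         # pick some x not in k or S
        n += 1
    x = f"x{n}"
    return ("bind", sym, x, tuple(to_named(u, (x,) + k, S) for u in args))


closed = to_named(("bind", "xi", (("bind", "xi", (2,)),)))
open_term = to_named(("bind", "lam", (("fun", "app", (1, 3)),)))
```

For $\xi(\xi(2))$ this yields $\xi x_1.(\xi x_2.(x_1))$, and for $\lambda(app(1, 3))$ it yields $\lambda x_1.(app(x_1, x_2))$, where the free index is named $x_2$ and the bound variable avoids that name.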
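Definition 29 (From de Bruijn Pre-metaterms to Pre-metaterms). The translation of a de Bruijn pre-metaterm $A$, denoted $U(A)$, is defined as $U_\epsilon(A)$, where $U_l(A)$ is defined as follows:

\[
\begin{array}{lcll}
U_l(S^i(1)) & \stackrel{def}{=} & at(l, i + 1) & \textrm{if } i + 1 \leq |l| \\
U_l(S^{|l|}(\hat\alpha)) & \stackrel{def}{=} & \hat\alpha & \\
U_l(X_l) & \stackrel{def}{=} & X & \\
U_l(f(A_1, \ldots, A_n)) & \stackrel{def}{=} & f(U_l(A_1), \ldots, U_l(A_n)) & \\
U_l(\xi(A_1, \ldots, A_n)) & \stackrel{def}{=} & \xi\alpha.(U_{\alpha l}(A_1), \ldots, U_{\alpha l}(A_n)) & \textrm{if } \forall\, 1 \leq i \leq n\ \mathcal{WF}_{\alpha l}(A_i) \textrm{ for some } \alpha \notin l \\
U_l(A_1[A_2]) & \stackrel{def}{=} & U_{\alpha l}(A_1)[\alpha \leftarrow U_l(A_2)] & \textrm{if } \mathcal{WF}_{\alpha l}(A_1) \textrm{ for some } \alpha \notin l
\end{array}
\]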

As in Definition 28 we remark that the translation of a de Bruijn pre-metaterm is not a function since it depends on the choice of the names for o-metavariables. Nevertheless, two different pre-metaterms obtained by this translation will be $v$-equivalent. Also, for some de Bruijn pre-metaterms such as $\xi(2)$, the translation may not be defined. However, it is defined on de Bruijn metaterms.



Definition 30 (From SERSdb Rewrite Rules to SERS Rewrite Rules). Let $(L, R)$ be a de Bruijn rewrite rule. Then its translation, denoted $U(L, R)$, is the pair of metaterms $(U_\epsilon(L), U_\epsilon(R))$.

Note that if $\mathcal{WF}_l(A)$ holds then $U_l(A)$ is also a named metaterm, that is, $\mathcal{WF}_l(U_l(A))$ also holds. Therefore, by Definition 9 the translation of a de Bruijn rule is a rule in SERS. As mentioned above, if a de Bruijn pre-metaterm $A$ is not a de Bruijn metaterm then $U_l(A)$ may not be defined.



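Example 8. Consider the rule $@(\Delta(X_\alpha), Z_\epsilon) \to \Delta(X_\alpha[\lambda(@(S(1), @(1, Z_{\gamma\beta})))])$ from Example 7. The translation in Definition 29 yields the rule $@(\Delta\alpha.(X), Z) \to \Delta\beta.(X[\alpha \leftarrow \lambda\gamma.(@(\beta, @(\gamma, Z)))])$, and the translation in Definition 29 on the rule $\Sigma(\sigma(S(\hat\beta)), Z_\epsilon) \to \hat\beta$ yields $\Sigma(\sigma\gamma.(\hat\beta), Z) \to \hat\beta$ for some bound metavariable $\gamma$.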



Proposition 2 (Simulating SERSdb Reduction via SERS Reduction). 

Suppose a — > b in the SERSdb formalism using rewrite rule (L,R). Then we 
have U{a ) — > U{b) in the SERS formalism using rule U{L,R). 

As regards the relationship between the translations over pre-metaterms and terms introduced above, we may obtain two results stating, respectively, that given a metaterm $M$, $U(T(M))$ is $v$-equivalent to $M$, and that given a de Bruijn metaterm $A$, $T(U(A))$ is identical to $A$. These results are used to show that confluence is preserved when translating in both directions.

Theorem 1 (Preservation of Confluence).

1. If $\mathcal{R}$ is a confluent SERS then $T(\mathcal{R})$ is a confluent SERSdb.

2. If $\mathcal{R}$ is a confluent SERSdb then $U(\mathcal{R})$ is a confluent SERS.
5 Conclusions

We have proposed a formalism for higher-order rewriting with de Bruijn nota- 
tion and we have shown that rewriting with names and rewriting with indices are 
semantically equivalent. We have given formal translations from one formalism 
into the other which can be viewed as an interface in programming languages 
based on higher-order rewrite systems. This work fills the gap between classical 
presentations of higher-order rewriting with names existing in the literature and 
first-order presentations of higher-order rewriting such as [16]. Moreover, it ex- 
plicitly suggests that XRS are not sufficient to express an arbitrary higher-order 
rewrite system. 

Further ongoing work uses the formalism presented here to propose a tool for 
implementing higher-order rewrite systems via first-order ones. This tool would 
incorporate not only de Bruijn notation but also explicit substitutions in a very 
general form. 
Abstract. This paper presents a general method for studying some quotients of the special linear group $SL_2$ over the integers, which are of fundamental interest in the field of statistical physics. Our method automatically helps in validating some conjectures due to physicists, such as conjectures stating that a set of equations completely describes a given finite quotient of $SL_2$. In a first step, we show that in the cases we are interested in, the usual presentation of finitely generated groups with some constant generators and a binary concatenation can be turned into an equivalent one with unary generators. In a second step, when the completion of the transformed set of equations terminates, we show how to compute directly the associated normal forms automaton. According to the presence of loops, we are able to decide the finiteness of the quotient, and to compute its cardinality. When the quotient is infinite, the automaton gives some hints on what kind of equations are needed in order to ensure the finiteness of the quotient.



Introduction 

In the field of statistical physics in dimension 2, conformal invariance plays a crucial role [9, 5, 4]. It makes it possible to express a physical theory which is invariant under geometrical transformations preserving the angles. Among the possible applications, one has to mention the study of demagnetization phenomena (Ising model), of polymer films and of string theory.

In this paper, we provide some tools for studying the so-called rational conformal theories, in which a finite-dimensional representation of $SL_2(\mathbb{Z})$ appears.

The mathematical formulation of the problem is to investigate the structure of the infinite group $SL_2(\mathbb{Z})$ generated by two $2 \times 2$ matrices usually denoted by $S$ and $T$. In particular, we want to study some quotients of $SL_2(\mathbb{Z})$, and determine whether they are finite or not, and if they are finite, whether they are quotients of $SL_2(\mathbb{Z}/n\mathbb{Z})$ for some $n$. Rewriting techniques (see [6] for some background) are used here to study a class of finitely generated groups, quotiented by some additional ground equations, which are actually isomorphic to some quotients of $SL_2(\mathbb{Z})$. The study of finitely presented groups goes back to the end of the 19th century; the problem of deciding whether two such groups are isomorphic was first stated by Tietze, and the word problem by Dehn. Unfortunately, like many interesting problems, they were proven to be undecidable 50 years later (see [13] for the historical background). However, there are some positive results in some sub-cases, mainly for automatic groups [8] or when Knuth-Bendix completion [10] terminates [7].

* An extended version of this paper with complete proofs is available at
http://www.lri.fr/~monate/rtatp-ext.ps.gz

L. Bachmair (Ed.): RTA 2000, LNCS 1833, pp. 80-94, 2000.

© Springer-Verlag Berlin Heidelberg 2000

In a first step, we show that in this particular framework, the quotient algebras we are dealing with are isomorphic to quotient algebras where the signature contains only one constant function symbol and some unary function symbols, i.e. a monoid. This is done by a translation which eliminates the inverse function symbol. Compared with the method of Bündgen [2], ours does not use any term ordering nor completion. Then, we use Knuth-Bendix completion over the monoid in order to get a convergent rewrite system, and eventually, in the case when the completion succeeds, we build an automaton which accepts the terms in normal form.

We present here a specialized version of a general algorithm [12] which com- 
putes the bottom-up tree automaton directly, and not as the complement of the 
union of the automata for the instances of the left hand sides of the rewriting 
rules. In this paper, the automaton is not a fully general tree automaton, since 
we only have function symbols of arity $\leq 1$. Gilman [7] proposed a similar algo-
rithm building a directed graph for the case of finitely presented groups which 
can be interpreted as a top-down (finite states) automaton. 

Eventually, our automaton is used to determine the cardinality of the quotient 
algebra: if it contains a loop, then there are infinitely many distinct elements in 
the algebra, otherwise, there are finitely many, and we can count them as the 
number of paths from the initial state to final states in the automaton. 
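
The loop test and the path count described above can be sketched as a single depth-first traversal. The Python code below is a minimal illustration on a toy automaton, not the automaton of Fig. 1; the encoding of the transition table, the function name `count_normal_forms`, and the assumption that the automaton is trimmed (every reachable state can reach a final state) are choices made for this example.

```python
def count_normal_forms(transitions, initial, finals):
    """Number of accepted words, or None if a reachable loop makes it infinite.

    transitions maps a state to a list of (symbol, next_state) pairs.
    The automaton is assumed to be trimmed, so any reachable loop yields
    infinitely many accepted words.
    """
    memo, on_stack = {}, set()

    def paths_from(q):
        if q in memo:
            return memo[q]
        if q in on_stack:
            raise ValueError("loop")            # q is reachable from itself
        on_stack.add(q)
        total = 1 if q in finals else 0
        for _symbol, r in transitions.get(q, []):
            total += paths_from(r)
        on_stack.discard(q)
        memo[q] = total
        return total

    try:
        return paths_from(initial)
    except ValueError:
        return None


# Toy example: one unary generator T with T(T(T(x))) = x.
# Normal forms are 1, T(1) and T(T(1)).
cyclic3 = {"q0": [("T", "q1")], "q1": [("T", "q2")]}
finite_count = count_normal_forms(cyclic3, "q0", {"q0", "q1", "q2"})

# Without any equation, T can be applied forever: the state loops on itself.
free_monoid = {"q0": [("T", "q0")]}
infinite_count = count_normal_forms(free_monoid, "q0", {"q0"})
```

Memoising the count per state keeps the traversal linear in the size of the automaton even when many paths share suffixes.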

The paper is organized as follows: in the first section, we exhibit a finite group, $PSL_2(\mathbb{Z}/5\mathbb{Z})$, and we show how our method can be used in order to prove that it is isomorphic to a quotient term algebra. In the following sections, we give the theoretical results which justify the correctness of our method; in section 2, we show the isomorphism between two presentations of finitely generated groups, the first one where the generators are constant function symbols (with an additional binary concatenation symbol), the second one where the generators are unary function symbols. In section 3, we present a direct construction of a bottom-up tree automaton recognizing the terms which are in normal form w.r.t. a convergent rewrite system, where the function symbols are of arity $\leq 1$.



1 A Motivating Example: Klein’s Dodecahedron 

In this section, we consider the finitely generated group $T(\{\cdot, 1, inv, S, T\})/\mathcal{G} \cup \mathcal{N} \cup \mathcal{E}$, where $\cdot$ is the binary associative operation of the group, 1 is the unit, $inv$ is the unary inverse, $S$ and $T$ are the generators, $\mathcal{G}$ is an equational presentation of groups, $\mathcal{N} = \{S \cdot S = 1,\ T \cdot T \cdot T \cdot T \cdot T = 1\}$ and $\mathcal{E} = \{S \cdot T \cdot S \cdot T \cdot S \cdot T = 1\}$.

We show how our general method can be used in order to prove that the above group is isomorphic to $PSL_2(\mathbb{Z}/5\mathbb{Z})$, that is the group of $2 \times 2$ matrices over $\mathbb{Z}/5\mathbb{Z}$, the determinant of which is equal to 1, quotiented by the relation $M = -M$. $PSL_2(\mathbb{Z}/5\mathbb{Z})$ admits two generators $S = \left(\begin{array}{rr} 0 & -1 \\ 1 & 0 \end{array}\right)$ and $T = \left(\begin{array}{rr} 1 & 1 \\ 0 & 1 \end{array}\right)$, and can be seen as a dodecahedron where each edge is duplicated into two "oriented" edges. Right multiplying an element, that is an oriented edge, by $T$ amounts to following this oriented edge to the next one, keeping the same orientation, and right multiplying by $S$ amounts to taking the same edge with the converse orientation. This representation allows one to "prove" by hand that this group is finite [5].

First, according to the main theorem of section 2, we have that $T(\{\cdot, 1, inv, S, T\})/\mathcal{G} \cup \mathcal{N} \cup \mathcal{E}$ is isomorphic to $T(\{1, S, T\})/\tau(\mathcal{N} \cup \mathcal{E})$, where 1 is a constant symbol, $S$ and $T$ are unary function symbols:

\[
\begin{array}{lcl}
\tau(\mathcal{N}) & = & \{S^2(x) = x,\ T^5(x) = x\}, \textrm{ and} \\
\tau(\mathcal{E}) & = & \{S(T(S(T(S(T(x)))))) = x\}.
\end{array}
\]

Then, we complete $\tau(\mathcal{N} \cup \mathcal{E})$ into the convergent rewrite system $R$ (with respect to the RPO defined by the precedence $S > T > 1$)




S^{x) 

^{x)_ 

S{^{_S{x))) 

S(T(S_(x))) 

S(t'(S(^(x)))) 

S{T\s{T\six)m 



T{S(T{x))) 

T{S(t\s{_x)))) 

^{_S{T\_Sixm 

T{S{T\s{T{x)m 



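The above system has 7 rules. Note that if we complete directly $\mathcal{G} \cup \mathcal{N} \cup \mathcal{E}$, we get a convergent system with 26 rules: 10 rules for $\mathcal{G}$, 2 rules for rewriting $inv(S)$ and $inv(T)$, and $14 = 2 \times 7$ rules, each rule $C_1[x] \to C_2[x]$ of the above system being duplicated into a rule rewriting the canonical term associated with the context $C_1[\bullet]$ into the canonical term associated with $C_2[\bullet]$, and a second rule obtained from the first by right multiplying both sides by $x$, where $x$ is a variable (the canonical term associated with a context is introduced in definition 3 below).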

Then we use the 7-rule system for building an automaton recognizing the terms of $T(\{1, S, T\})$ which are in normal form. Since the automaton is a restriction of a tree automaton, a term, seen as a tree, is read bottom-up, hence, seen as a string, from right to left. We get the automaton given in figure 1. Eventually, since there are no loops in the automaton, we know that the quotient algebra is finite, and counting the number of distinct paths in this automaton, we get that the quotient algebra has 60 distinct elements. Note that the starting equalities $\mathcal{N} \cup \mathcal{E}$ are valid in $PSL_2(\mathbb{Z}/5\mathbb{Z})$, hence $PSL_2(\mathbb{Z}/5\mathbb{Z})$ is isomorphic to a quotient of $T(\{1, S, T\})/\tau(\mathcal{N} \cup \mathcal{E})$, and since we know that $PSL_2(\mathbb{Z}/5\mathbb{Z})$ has 60 elements, $PSL_2(\mathbb{Z}/5\mathbb{Z})$ is actually isomorphic to $T(\{1, S, T\})/\tau(\mathcal{N} \cup \mathcal{E})$.
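
The two facts about $PSL_2(\mathbb{Z}/5\mathbb{Z})$ used in this argument (the equalities of $\mathcal{N} \cup \mathcal{E}$ hold in it, and it has 60 elements) can be checked by brute force. The Python sketch below is an independent sanity check, not part of the authors' method. It assumes the standard generators $S$ with rows $(0, -1), (1, 0)$ and $T$ with rows $(1, 1), (0, 1)$, encodes a matrix as a 4-tuple in row-major order, and represents a class $\{M, -M\}$ by the smaller of its two tuples.

```python
P = 5


def mul(a, b):
    """Product of two 2x2 matrices over Z/PZ, stored row-major as 4-tuples."""
    a11, a12, a21, a22 = a
    b11, b12, b21, b22 = b
    return ((a11 * b11 + a12 * b21) % P, (a11 * b12 + a12 * b22) % P,
            (a21 * b11 + a22 * b21) % P, (a21 * b12 + a22 * b22) % P)


def canon(m):
    """Representative of the class {M, -M}."""
    return min(m, tuple((-x) % P for x in m))


def word(*letters):
    """Product of the given matrices, read left to right, modulo +/-."""
    result = I
    for g in letters:
        result = canon(mul(result, g))
    return result


I = (1, 0, 0, 1)
S = canon((0, P - 1, 1, 0))
T = canon((1, 1, 0, 1))


def generated_group(generators):
    """Closure of {I} under right multiplication by the generators."""
    seen, frontier = {I}, [I]
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = canon(mul(g, h))
            if gh not in seen:
                seen.add(gh)
                frontier.append(gh)
    return seen


group_order = len(generated_group([S, T]))
relations_hold = (word(S, S) == I and word(T, T, T, T, T) == I
                  and word(S, T, S, T, S, T) == I)
```

Under these assumptions `relations_hold` is true and `group_order` is 60, in agreement with the count obtained from the automaton.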
Fig. 1. A normal forms automaton for Klein's dodecahedron.



2 Eliminating the Inverse from a Presentation of a Group 

It is well-known folklore that the ground terms of a finitely generated monoid 
may be seen as built over a signature containing the generators considered either 
as constants or as unary function symbols. In the first case, the signature has 
to contain a binary associative symbol for concatenation, in the second case 
it has to contain a constant which is the common leaf of all ground terms. 
In this section, we generalize this view to a class of finitely generated groups, 
which are defined as quotient algebras $T(\mathcal{F})/\mathcal{G} \cup \mathcal{N} \cup \mathcal{E}$ over a signature 
$\mathcal{F} = \{\cdot, 1, \mathrm{inv}, L_1, \ldots, L_n\}$, where $\cdot$ is the associative binary concatenation, 1 is 
a constant intended to represent the unit of the group, inv is a unary function 
symbol denoting the inverse of an element in the group, and 
$L_1, \ldots, L_n$ are the constant generators. The algebra is quotiented by a set of 
equations divided into three subsets: $\mathcal{G}$ is a presentation of (non-commutative) 
groups, 

$$\mathcal{N} = \{L_1^{j_1} = 1, \ldots, L_n^{j_n} = 1\}$$

with $j_i > 0$ for all $i$, where $L_i^j$ denotes $\underbrace{L_i \cdot \ldots \cdot L_i}_{j \text{ times}}$, and $\mathcal{E}$ contains only 
equations between ground terms. In the following, $\mathcal{G}$ will be either $\mathcal{G}_{min}$, a minimal 
presentation of groups, or $\mathcal{G}_c$, an equivalent convergent presentation obtained 
from $\mathcal{G}_{min}$ by completion: 



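$$\mathcal{G}_{min} = \left\{ \begin{array}{rcl}
(x \cdot y) \cdot z &=& x \cdot (y \cdot z) \\
x \cdot 1 &=& x \\
x \cdot \mathrm{inv}(x) &=& 1
\end{array} \right.
\qquad
\mathcal{G}_c = \left\{ \begin{array}{rcl}
(x \cdot y) \cdot z &=& x \cdot (y \cdot z) \\
x \cdot 1 &=& x \\
1 \cdot x &=& x \\
x \cdot \mathrm{inv}(x) &=& 1 \\
\mathrm{inv}(x) \cdot x &=& 1 \\
x \cdot (\mathrm{inv}(x) \cdot y) &=& y \\
\mathrm{inv}(x) \cdot (x \cdot y) &=& y \\
\mathrm{inv}(x \cdot y) &=& \mathrm{inv}(y) \cdot \mathrm{inv}(x) \\
\mathrm{inv}(1) &=& 1 \\
\mathrm{inv}(\mathrm{inv}(x)) &=& x
\end{array} \right.$$

We will also consider $T(\overline{\mathcal{F}})$, the algebra built over the "unary" signature 
$\overline{\mathcal{F}} = \{1, \overline{L}_1, \ldots, \overline{L}_n\}$ associated with $\{\cdot, 1, \mathrm{inv}, L_1, \ldots, L_n\}$, where 1 is a constant 
symbol, and $\overline{L}_1, \ldots, \overline{L}_n$ are unary function symbols. As usual, $\overline{L}_i^{\,j}(w)$ denotes 
$\underbrace{\overline{L}_i(\ldots(\overline{L}_i}_{j \text{ times}}(w))\ldots)$, when $w$ is a term in $T(\overline{\mathcal{F}})$. 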

In the following, we define a translation from the terms of $T(\mathcal{F})$ into the 
terms of $T(\overline{\mathcal{F}})$. A first attempt would translate, for example, $L_1 \cdot L_1 \cdot L_2$ into 
$\overline{L}_1(\overline{L}_1(\overline{L}_2(1)))$. But this translation has to be compatible with the term algebra 
structure, since our aim is to get an isomorphism. In particular, if we add the 
context $L_3 \cdot \bullet \cdot L_4$ to $L_1 \cdot L_1 \cdot L_2$, this has to be reflected by the translation. A part of 
the context, namely $L_3 \cdot \bullet$, can be translated into the context $\overline{L}_3(\bullet)$, but the other 
part, $\bullet \cdot L_4$, cannot be considered as a context in the term $\overline{L}_3(\overline{L}_1(\overline{L}_1(\overline{L}_2(\overline{L}_4(1)))))$. 
The point is that a ground context in $T(\mathcal{F})$ has to be divided into a context 
part and a substitution part in $T(\overline{\mathcal{F}})$: the translation of $L_1 \cdot L_1 \cdot L_2$ should be 
$\overline{L}_1(\overline{L}_1(\overline{L}_2(x)))$ where $x$ is a variable, and adding the context $L_3 \cdot \bullet \cdot L_4$ amounts 
to adding the context $\overline{L}_3(\bullet)$ and to applying the substitution $x \mapsto \overline{L}_4(x)$. 

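Finally we come to the idea of defining a translation $\tau$ from the pairs 
of $T(\mathcal{F}) \times T(\overline{\mathcal{F}}, X)$ into $T(\overline{\mathcal{F}}, X)$. The intuitive meaning of $\tau$ is to concatenate 
the translation of its first argument to its second argument. We also take into 
account the equations of $\mathcal{G}$ and $\mathcal{N}$ during the translation, since there is no direct 
way of expressing the inverse of an element in $T(\overline{\mathcal{F}}, X)$. Hence $\tau$ is defined as 

$\tau(1, w) = w$ 
$\tau(\mathrm{inv}(1), w) = w$ 
$\forall i \quad \tau(L_i, w) = \overline{L}_i(w)$ 
$\forall i \quad \tau(\mathrm{inv}(L_i), w) = \overline{L}_i^{\,j_i - 1}(w)$ 
$\tau(t_1 \cdot t_2, w) = \tau(t_1, \tau(t_2, w))$ 
$\tau(\mathrm{inv}(t_1 \cdot t_2), w) = \tau(\mathrm{inv}(t_2), \tau(\mathrm{inv}(t_1), w))$ 
$\tau(\mathrm{inv}(\mathrm{inv}(t)), w) = \tau(t, w)$ 

where $t$, $t_1$ and $t_2$ are any ground terms of $T(\mathcal{F})$, and $w$ any term in $T(\overline{\mathcal{F}}, X)$. 
It is quite obvious that $\tau$ is completely defined. 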
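
The seven defining clauses of the translation can be read directly as a recursive function. The following is a minimal sketch, not the authors' implementation: binary terms are encoded as nested tuples with the hypothetical tags `"one"`, `"gen"`, `"mul"` and `"inv"`, unary terms are written as strings such as `"STx"` for S(T(x)), generator names are assumed to be single characters, and `orders` maps each generator to its exponent j_i in N.

```python
def tau(t, w, orders):
    """Translate the binary ground term t and prepend it to the unary term w.

    t is ("one",), ("gen", name), ("mul", t1, t2) or ("inv", t1);
    w is a string such as "x", "1" or "STx"; orders[name] is j_i.
    """
    kind = t[0]
    if kind == "one":                       # tau(1, w) = w
        return w
    if kind == "gen":                       # tau(L_i, w) = L_i(w)
        return t[1] + w
    if kind == "mul":                       # tau(t1 . t2, w) = tau(t1, tau(t2, w))
        return tau(t[1], tau(t[2], w, orders), orders)
    if kind == "inv":
        u = t[1]
        if u[0] == "one":                   # tau(inv(1), w) = w
            return w
        if u[0] == "gen":                   # tau(inv(L_i), w) = L_i^(j_i - 1)(w)
            return u[1] * (orders[u[1]] - 1) + w
        if u[0] == "mul":                   # inverse of a product reverses the factors
            return tau(("inv", u[2]), tau(("inv", u[1]), w, orders), orders)
        if u[0] == "inv":                   # tau(inv(inv(t)), w) = tau(t, w)
            return tau(u[1], w, orders)
    raise ValueError(f"not a ground term of the binary signature: {t!r}")

if __name__ == "__main__":
    ORDERS = {"S": 2, "T": 5}               # exponents assumed for the example of section 1
    S, T = ("gen", "S"), ("gen", "T")
    print(tau(("mul", S, ("inv", T)), "x", ORDERS))   # S . inv(T)
    print(tau(("inv", ("mul", S, T)), "x", ORDERS))   # inv(S . T)
```

The inverse of a generator is never represented explicitly: it is replaced by the power j_i - 1 of the corresponding unary symbol, which is why the equations of N are needed during the translation.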

Definition 1. The translation of an equation $l = r$, where $l$ and $r$ are ground 
terms in $T(\mathcal{F})$, is the equation $\tau(l, x) = \tau(r, x)$, where $x$ is a given variable of $X$. 
It is denoted by $\tau(l = r)$. The translation is extended to a set of ground equations 
in the obvious way. 

Some of the proofs in the following are by induction over the size of terms 
and contexts. This leads us to introduce the following definition: 

Definition 2. The size of a term $t$ of $T(\mathcal{F})$, $\|t\|$, is defined as follows: 

$\|1\| = 1$ $\qquad$ $\|t_1 \cdot t_2\| = \|t_1\| + \|t_2\|$ 

$\forall i \quad \|L_i\| = 1$ $\qquad$ $\|\mathrm{inv}(t_1)\| = \max_i\{j_i\} \times \|t_1\|$ 

and the size of a context $C[\bullet]$ in $T(\overline{\mathcal{F}})$, as follows: 

$\|\bullet\| = 0$ $\qquad$ $\forall i \quad \|\overline{L}_i(C[\bullet])\| = 1 + \|C[\bullet]\|$ 

Lemma 1. Let $t$ be any ground term in $T(\mathcal{F})$. Then there exists a unique context 
$C_t[\bullet]$ in $T(\overline{\mathcal{F}})$ such that for all terms $w$ in $T(\overline{\mathcal{F}}, X)$, $\tau(t, w) = C_t[w]$, and moreover 

$\|C_t[\bullet]\| \le \|t\|$. 

Proof. By induction on the term structure of $t$. 



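Proposition 1. If $t$ and $t'$ are two ground terms in $T(\mathcal{F})$ such that $t =_{\mathcal{G}_c \cup \mathcal{N}} t'$ 
(or equivalently $t =_{\mathcal{G}_{min} \cup \mathcal{N}} t'$), then for all terms $w$ in $T(\overline{\mathcal{F}}, X)$, the following 
equality holds: 

$\tau(t, w) =_{\tau(\mathcal{N})} \tau(t', w)$ 

Proof. Since $\mathcal{G}_c$ and $\mathcal{G}_{min}$ are two equivalent presentations of groups, if $t =_{\mathcal{G}_c \cup \mathcal{N}} t'$, 
then $t =_{\mathcal{G}_{min} \cup \mathcal{N}} t'$, hence there exists an equational proof 

$t \equiv t_0 \leftrightarrow_{\mathcal{G}_{min} \cup \mathcal{N}} t_1 \ldots t_{m-1} \leftrightarrow_{\mathcal{G}_{min} \cup \mathcal{N}} t_m \equiv t'$ 

The proof of the proposition is by induction on the multiset $\{\|t_0\|, \|t_1\|, \ldots, \|t_m\|\}$. 
This is obvious when the proof has length zero. Let us assume that $m \ge 1$. 
We shall show that $\tau(t_0, w) =_{\tau(\mathcal{N})} \tau(t_1, w)$; then we can conclude by induction 
that $\tau(t_1, w) =_{\tau(\mathcal{N})} \tau(t_m, w)$, hence by transitivity of $=_{\tau(\mathcal{N})}$ that $\tau(t_0, w) =_{\tau(\mathcal{N})} \tau(t_m, w)$. 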

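— Let us assume that the first equational step occurs at the top of the terms 
$t_0$ and $t_1$. We reason by cases over the equation that has been applied. If the 
equation is $(x \cdot y) \cdot z = x \cdot (y \cdot z)$ or $x \cdot 1 = x$, we get the desired result by 
applying the definition of $\tau$. 

• Let us assume that the equation is $x \cdot \mathrm{inv}(x) = 1$. Since we only consider 
ground terms, we reason by cases over $x$: $x$ may be of the form 
$1, L_1, \ldots, L_n$, $u_1 \cdot u_2$ or $\mathrm{inv}(u)$, where $u_1$, $u_2$ and $u$ are closed terms. We 
treat in detail the cases $L_i$ and $u_1 \cdot u_2$. The other cases are similar. 

$\tau(L_i \cdot \mathrm{inv}(L_i), w)$ 
$\quad =_{def} \tau(L_i, \tau(\mathrm{inv}(L_i), w))$ 
$\quad =_{def} \overline{L}_i(\overline{L}_i^{\,j_i - 1}(w))$ 
$\quad =_{\tau(\mathcal{N})} \tau(1, w)$ since $\tau(L_i^{j_i}, x) = \tau(1, x) \in \tau(\mathcal{N})$. 

$\tau((u_1 \cdot u_2) \cdot \mathrm{inv}(u_1 \cdot u_2), w)$ 
$\quad =_{def} \tau(u_1, \tau(u_2, \tau(\mathrm{inv}(u_2), \tau(\mathrm{inv}(u_1), w))))$ 
$\quad =_{def} \tau(u_1, \tau(u_2 \cdot \mathrm{inv}(u_2), \tau(\mathrm{inv}(u_1), w)))$ 
$\quad =_{\tau(\mathcal{N})} \tau(u_1, \tau(1, \tau(\mathrm{inv}(u_1), w)))$ 
since by induction hypothesis $\tau(u_2 \cdot \mathrm{inv}(u_2), w') =_{\tau(\mathcal{N})} \tau(1, w')$ for all terms $w'$. 
$\quad =_{def} \tau(u_1, \tau(\mathrm{inv}(u_1), w))$ 
$\quad =_{def} \tau(u_1 \cdot \mathrm{inv}(u_1), w)$ 
$\quad =_{\tau(\mathcal{N})} \tau(1, w)$ 
since by induction hypothesis $\tau(u_1 \cdot \mathrm{inv}(u_1), w) =_{\tau(\mathcal{N})} \tau(1, w)$. 

• Let us assume that the equation is $L_i^{j_i} = 1$. 

$\tau(L_i^{j_i}, w) =_{\tau(\mathcal{N})} \tau(1, w)$ since $\tau(L_i^{j_i}, x) = \tau(1, x) \in \tau(\mathcal{N})$. 

— Let us assume that the first equational step occurs strictly under the top of 
the terms, at a position $p$: $t_0 \equiv t_0[l\sigma]_p \leftrightarrow t_0[r\sigma]_p \equiv t_1$ with $p > \epsilon$. 
We reason by cases over the top of $t_0$ and $p$. There are four cases: the top of 
$t_0$ is equal to $\cdot$, and $p = 1 \cdot p'$ or $p = 2 \cdot p'$; the top of $t_0$ is equal to inv and 
$p = 1$ or $p = 1 \cdot p'$ with $p' \ne \epsilon$. We will consider in detail only the last two 
cases. 

• If the top of $t_0$ is equal to inv and $p = 1$, we reason by cases over the 
equation that has been used. If the equation is $(x \cdot y) \cdot z = x \cdot (y \cdot z)$ or 
$x \cdot 1 = x$, we get the desired result by applying the definition of $\tau$. 

* Let us assume that the equation is $x \cdot \mathrm{inv}(x) = 1$. 

$\tau(t_0, w) = \tau(\mathrm{inv}(t'_0 \cdot \mathrm{inv}(t'_0)), w)$ 
$\quad =_{def} \tau(\mathrm{inv}(\mathrm{inv}(t'_0)), \tau(\mathrm{inv}(t'_0), w))$ 
$\quad =_{def} \tau(t'_0, \tau(\mathrm{inv}(t'_0), w))$ 
$\quad =_{def} \tau(t'_0 \cdot \mathrm{inv}(t'_0), w)$ 
$\quad =_{\tau(\mathcal{N})} \tau(1, w)$ by induction hypothesis with the term $w$ of $T(\overline{\mathcal{F}}, X)$. 

* Let us assume that the equation is $L_i^{j_i} = 1$. 

$\tau(t_0, w) = \tau(\mathrm{inv}(L_i^{j_i}), w)$ 
$\quad =_{def} \underbrace{\tau(\mathrm{inv}(L_i), \tau(\mathrm{inv}(L_i), \ldots, \tau(\mathrm{inv}(L_i)}_{j_i \text{ times}}, w) \ldots))$ 
$\quad =_{def} \overline{L}_i^{\,j_i \times (j_i - 1)}(w)$ 
$\quad =_{\tau(\mathcal{N})} w$ since $\tau(L_i^{j_i}, x) = \tau(1, x) \in \tau(\mathcal{N})$. 
$\quad =_{def} \tau(\mathrm{inv}(1), w)$ 
$\quad = \tau(t_1, w)$ 

• If the top of $t_0$ is equal to inv, and $p = 1 \cdot p'$ with $p' \ne \epsilon$, there are three 
cases: $t_0 = \mathrm{inv}(\mathrm{inv}(t'_0))$ with $p = 1 \cdot 1 \cdot p''$, $t_0 = \mathrm{inv}(t'_0 \cdot t''_0)$ with $p = 1 \cdot 1 \cdot p''$, 
or $p = 1 \cdot 2 \cdot p''$. We will treat in detail only the second case; the others 
are similar. Here $t_0 = \mathrm{inv}(t'_0 \cdot t''_0)$, $t_1 = \mathrm{inv}(t'_1 \cdot t''_0)$ and $t'_0 \leftrightarrow^{p''} t'_1$. Hence 
$\mathrm{inv}(t'_0) \leftrightarrow^{1 \cdot p''} \mathrm{inv}(t'_1)$. By induction hypothesis, since the size of 
the equational proof $\mathrm{inv}(t'_0) \leftrightarrow \mathrm{inv}(t'_1)$ is strictly smaller than that of 
$t_0 \leftrightarrow t_1$, we get that for all terms $w$ in $T(\overline{\mathcal{F}}, X)$, 

$\tau(\mathrm{inv}(t'_0), w) =_{\tau(\mathcal{N})} \tau(\mathrm{inv}(t'_1), w)$ 

$\tau(t_0, w) = \tau(\mathrm{inv}(t'_0 \cdot t''_0), w)$ 
$\quad =_{def} \tau(\mathrm{inv}(t''_0), \tau(\mathrm{inv}(t'_0), w))$ 
$\quad =_{\tau(\mathcal{N})} \tau(\mathrm{inv}(t''_0), \tau(\mathrm{inv}(t'_1), w))$ 
by induction hypothesis with the term $w$ of $T(\overline{\mathcal{F}}, X)$, under the context $\tau(\mathrm{inv}(t''_0), \bullet)$. 
$\quad =_{def} \tau(\mathrm{inv}(t'_1 \cdot t''_0), w)$ 
$\quad = \tau(t_1, w)$ 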



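Proposition 2. Let $l = r$ be any equation between ground terms in $T(\mathcal{F})$. Then 
for all ground terms $s, t$ in $T(\mathcal{F})$ such that $s \leftrightarrow^{p}_{l=r} t$, for all terms $w$ in $T(\overline{\mathcal{F}}, X)$, 

$\tau(s, w) =_{\tau(\mathcal{N} \cup \{l = r\})} \tau(t, w)$ 

Sketch of the proof. By induction on the length of $p$. There are several cases, 
according to $p$ and the top symbol of $s$: $p = \epsilon$; $p = 1 \cdot p'$ or $p = 2 \cdot p'$ and the top 
symbol of $s$ is equal to $\cdot$; $p = 1$ or $p = 1 \cdot p'$ (with $p' \ne \epsilon$) and the top symbol 
is equal to inv. We shall detail only two cases, the first and the third one. This 
last case explains why we need $\tau(\mathcal{N} \cup \{l = r\})$, and not only $\tau(l = r)$. 

— If $p = \epsilon$, then since $l$ and $r$ are closed terms, $s \equiv l$ and $t \equiv r$, hence 
$\tau(s, w) \leftrightarrow \tau(t, w)$ with the equation $\tau(l, x) = \tau(r, x)$, at the top, and with 
the substitution $\{x \mapsto w\}$. 

— If $s = \mathrm{inv}(s_1)$, $p = 1$, then $t = \mathrm{inv}(t_1)$, and $s_1 \leftrightarrow^{\epsilon} t_1$. Since $s$ and $t$ are 
ground terms, so are $s_1$ and $t_1$, hence $s_1 \equiv l$, $t_1 \equiv r$. By proposition 1, since 
$l \cdot \mathrm{inv}(l)$ (resp. $r \cdot \mathrm{inv}(r)$) and 1 are ground terms in $T(\mathcal{F})$, and $l \cdot \mathrm{inv}(l) =_{\mathcal{G}_c} 1$ 
(resp. $r \cdot \mathrm{inv}(r) =_{\mathcal{G}_c} 1$), for all terms $w$ in $T(\overline{\mathcal{F}}, X)$, 

$\tau(l \cdot \mathrm{inv}(l), w) =_{\tau(\mathcal{N})} \tau(1, w)$ $\qquad$ $\tau(r \cdot \mathrm{inv}(r), w) =_{\tau(\mathcal{N})} \tau(1, w)$ 

Hence, we get 

$\tau(s, w) = \tau(\mathrm{inv}(l), w)$ 
$\quad =_{\tau(\mathcal{N})} \tau(\mathrm{inv}(l), \tau(r \cdot \mathrm{inv}(r), w))$ 
Proposition 1 with the term $w$, under the context $\tau(\mathrm{inv}(l), \bullet)$. 
$\quad =_{def} \tau(\mathrm{inv}(l), \tau(r, \tau(\mathrm{inv}(r), w)))$ 
$\quad =_{\tau(l = r)} \tau(\mathrm{inv}(l), \tau(l, \tau(\mathrm{inv}(r), w)))$ 
under the context $\tau(\mathrm{inv}(l), \bullet)$, with the substitution $x \mapsto \tau(\mathrm{inv}(r), w)$. 
$\quad =_{def} \tau(\mathrm{inv}(l) \cdot l, \tau(\mathrm{inv}(r), w))$ 
$\quad =_{\tau(\mathcal{N})} \tau(\mathrm{inv}(r), w)$ 
Proposition 1 with the term $\tau(\mathrm{inv}(r), w)$, at the top. 
$\quad = \tau(t, w)$ 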



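Definition 3. Let $W[\bullet]$ be any ground context of $T(\overline{\mathcal{F}})$. The canonical term 
associated with $W[\bullet]$, $\widehat{W[\bullet]}$, is defined as follows: 

$\widehat{\bullet} = 1$ $\qquad$ $\widehat{\overline{L}_i(W[\bullet])} = L_i \cdot \widehat{W[\bullet]}$ 

Lemma 2. Let $W[\bullet]$ be any ground context in $T(\overline{\mathcal{F}})$. The following identity 
holds: $C_{\widehat{W[\bullet]}}[\bullet] = W[\bullet]$ (where $C_{\widehat{W[\bullet]}}$ is defined by lemma 1). Moreover $\|W[\bullet]\| = \|\widehat{W[\bullet]}\|$. 

The proof follows immediately from the definition. 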
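
The canonical term and the identity of lemma 2 can be illustrated on small contexts. This is only a sketch under assumed encodings: a unary context is written as the string of its symbols (`"ST"` for S(T(•))), binary terms are nested tuples with the hypothetical tags `"one"`, `"gen"`, `"mul"`, and the translation is restricted to inverse-free terms, which is all that canonical terms require.

```python
def canonical_term(context):
    """Canonical binary term of a unary context given as a string of symbols.

    hat(.) = 1 and hat(L(W[.])) = L . hat(W[.]).
    """
    if not context:
        return ("one",)
    return ("mul", ("gen", context[0]), canonical_term(context[1:]))

def tau(t, w):
    """The translation, restricted to inverse-free binary terms."""
    if t[0] == "one":
        return w
    if t[0] == "gen":
        return t[1] + w
    if t[0] == "mul":
        return tau(t[1], tau(t[2], w))
    raise ValueError("inverse-free terms only")

if __name__ == "__main__":
    W = "STT"                       # the context S(T(T(.)))
    u = canonical_term(W)
    print(u)
    print(tau(u, "x") == W + "x")   # translating the canonical term gives back W[x]
```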

Proposition 3. Let $t$ and $t'$ be two ground terms in $T(\mathcal{F})$ such that 
$\tau(t, 1) =_{\tau(\mathcal{N})} \tau(t', 1)$. Then the equality $t =_{\mathcal{G}_c \cup \mathcal{N}} t'$ holds. 

Sketch of the proof. By induction on the pair $(m, \{\|t\|, \|t'\|\})$, with a lexicographic 
ordering, where $m$ is the length of the equational proof 
$\tau(t, 1) =_{\tau(\mathcal{N})} \tau(t', 1)$. 

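— We assume first that $m$ is equal to 0, hence that $\tau(t, 1) \equiv \tau(t', 1)$. We reason 
by cases over the term structure of $t$ and $t'$. We shall only detail the case 
when $t = t_1 \cdot t_2$ and $t' = t'_1 \cdot t'_2$, which is the most interesting one. In this 
case we get that $C_{t_1}[C_{t_2}[1]] \equiv C_{t'_1}[C_{t'_2}[1]]$. This means that there exist three 
contexts in $T(\overline{\mathcal{F}})$, $U_1[\bullet]$, $U_2[\bullet]$ and $U_3[\bullet]$, such that 

$C_{t_1}[\bullet] = U_1[\bullet]$ $\qquad$ $C_{t'_1}[\bullet] = U_1[U_2[\bullet]]$ 
$C_{t_2}[\bullet] = U_2[U_3[\bullet]]$ $\qquad$ $C_{t'_2}[\bullet] = U_3[\bullet]$ 

or such that 

$C_{t_1}[\bullet] = U_1[U_2[\bullet]]$ $\qquad$ $C_{t'_1}[\bullet] = U_1[\bullet]$ 
$C_{t_2}[\bullet] = U_3[\bullet]$ $\qquad$ $C_{t'_2}[\bullet] = U_2[U_3[\bullet]]$ 

We only consider the first case, since the second one is symmetrical. Let $u_j$ 
be the canonical term associated with the context $U_j[\bullet]$, for $j = 1, 2, 3$. 
$\tau(u_j, \bullet) = U_j[\bullet]$ holds and moreover $\|U_j[\bullet]\| = \|u_j\|$. We can now rewrite the 
above equalities as 

$\tau(t_1, 1) = \tau(u_1, 1)$ $\qquad$ $\tau(t'_1, 1) = \tau(u_1 \cdot u_2, 1)$ 
$\tau(t_2, 1) = \tau(u_2 \cdot u_3, 1)$ $\qquad$ $\tau(t'_2, 1) = \tau(u_3, 1)$ 

$\|u_1\| = \|U_1[\bullet]\| = \|C_{t_1}[\bullet]\| \le \|t_1\| < \|t\|$ 

$\|u_2 \cdot u_3\| = \|u_2\| + \|u_3\| = \|U_2[\bullet]\| + \|U_3[\bullet]\| \le \|U_1[U_2[\bullet]]\| + \|U_3[\bullet]\| = \|C_{t'_1}[\bullet]\| + \|C_{t'_2}[\bullet]\| \le \|t'_1\| + \|t'_2\| = \|t'\|$ 

Similarly, $\|u_1 \cdot u_2\| \le \|t\|$ and $\|u_3\| \le \|t'_2\| < \|t'\|$. We can apply the induction 
hypothesis on $(t_1, u_1)$, $(t_2, u_2 \cdot u_3)$, $(t'_1, u_1 \cdot u_2)$, and on $(t'_2, u_3)$. We get 

$t_1 =_{\mathcal{G}_c \cup \mathcal{N}} u_1$ $\qquad$ $t'_1 =_{\mathcal{G}_c \cup \mathcal{N}} u_1 \cdot u_2$ 
$t_2 =_{\mathcal{G}_c \cup \mathcal{N}} u_2 \cdot u_3$ $\qquad$ $t'_2 =_{\mathcal{G}_c \cup \mathcal{N}} u_3$ 

Hence $t = t_1 \cdot t_2 =_{\mathcal{G}_c \cup \mathcal{N}} u_1 \cdot (u_2 \cdot u_3) =_{\mathcal{G}_c \cup \mathcal{N}} (u_1 \cdot u_2) \cdot u_3 =_{\mathcal{G}_c \cup \mathcal{N}} t'_1 \cdot t'_2 = t'$. 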
— Let us now consider the case when there is at least one step in the equational 
proof $\tau(t, 1) =_{\tau(\mathcal{N})} \tau(t', 1)$, i.e. $m \ge 1$. We consider the first step of this proof; 
it uses an equation of the form $\tau(l, x) = \tau(r, x)$, where $l = r$ is an equation of 
$\mathcal{N}$. Hence $\tau(t, 1) \equiv W_1[\tau(l, W_2[1])] \leftrightarrow W_1[\tau(r, W_2[1])]$. Let $u_1$ and $u_2$ 
be respectively the canonical terms associated with the ground contexts of 
$T(\overline{\mathcal{F}})$, $W_1[\bullet]$ and $W_2[\bullet]$. We get that 

$\tau(t, 1) \equiv W_1[\tau(l, W_2[1])] \equiv \tau(u_1, \tau(l, \tau(u_2, 1))) \equiv \tau(u_1 \cdot (l \cdot u_2), 1)$ 

Hence by induction hypothesis, we get that $t =_{\mathcal{G}_c \cup \mathcal{N}} u_1 \cdot (l \cdot u_2)$. Since $l = r$ is 
an equation of $\mathcal{N}$, we also obviously have that $u_1 \cdot (l \cdot u_2) =_{\mathcal{G}_c \cup \mathcal{N}} u_1 \cdot (r \cdot u_2)$. 
We denote by $t''$ the term $u_1 \cdot (r \cdot u_2)$, and we have that $\tau(t'', 1) \equiv 
W_1[\tau(r, W_2[1])]$ is equal to $\tau(t', 1)$ with an equational proof of length $m - 1$. 
We can apply the induction hypothesis to $(t'', t')$, and we get that $t'' =_{\mathcal{G}_c \cup \mathcal{N}} t'$. 
Finally we conclude that $t =_{\mathcal{G}_c \cup \mathcal{N}} t'$. 



Proposition 4. Let $l = r$ be a ground equation over $T(\mathcal{F})$, and $s$ and $t$ two 
ground terms of $T(\mathcal{F})$. If $\tau(s, 1) \leftrightarrow^{p}_{\tau(l = r)} \tau(t, 1)$, then $s =_{\mathcal{G}_c \cup \mathcal{N} \cup \{l = r\}} t$. 

Sketch of the proof. The proof is by induction over the length of $p$, the position 
where the equational step occurs. 

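From the above propositions, we get the following 

Theorem 1. Let $s$ and $t$ be two ground terms of $T(\mathcal{F})$. Then 

$s =_{\mathcal{G}_c \cup \mathcal{N} \cup \mathcal{E}} t$ if and only if $\tau(s, 1) =_{\tau(\mathcal{N} \cup \mathcal{E})} \tau(t, 1)$. 

Proof. — If $s =_{\mathcal{G}_c \cup \mathcal{N} \cup \mathcal{E}} t$, then by proposition 1 and proposition 2, we get that 
$\tau(s, w) =_{\tau(\mathcal{N} \cup \mathcal{E})} \tau(t, w)$, for all terms $w$ of $T(\overline{\mathcal{F}}, X)$, in particular for $w = 1$. 
— If $\tau(s, 1) =_{\tau(\mathcal{N} \cup \mathcal{E})} \tau(t, 1)$, then by proposition 3 and proposition 4, we get 
that $s =_{\mathcal{G}_c \cup \mathcal{N} \cup \mathcal{E}} t$. 

Corollary 1. The quotient algebras $T(\mathcal{F})/\mathcal{G}_c \cup \mathcal{N} \cup \mathcal{E}$ and $T(\overline{\mathcal{F}})/\tau(\mathcal{N} \cup \mathcal{E})$ are 
isomorphic; in particular, they have the same cardinality. 

Proof. From the above theorem, $s \mapsto \tau(s, 1)$ is a one-to-one mapping from 
$T(\mathcal{F})/\mathcal{G}_c \cup \mathcal{N} \cup \mathcal{E}$ into $T(\overline{\mathcal{F}})/\tau(\mathcal{N} \cup \mathcal{E})$, and if $w$ is a ground term of $T(\overline{\mathcal{F}})$, then 
it has the form $C[1]$, hence $w \equiv \tau(\widehat{C[\bullet]}, 1)$, and the mapping $s \mapsto \tau(s, 1)$ is 
surjective. 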

3 Computing Normal Forms Automata 

Our aim is to count the elements of a group presented as in section 2. Assuming 
that the completion of the "unary" presentation terminates, we build an 
automaton that recognizes the set of ground normal forms for the convergent 
rewriting system $R$ defining this group. If the number $m$ of accepting paths is 
finite then the group we study is finite and has $m$ elements. Otherwise, there 
is a loop which represents an infinite set of distinct elements in the group. Let 
$w$ be the word of function symbols corresponding to the labels in this loop; the 
language of normal forms contains all the $w^p$'s. In order to fold the group into 
a finite one, $w$ has to have a finite order $p$, therefore we shall add an equation 
of the form $w^p = 1$. Note that we have no information on $p$, which is left to the 
choice of the user. 
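
Counting accepting paths and detecting a loop are both simple graph traversals. The sketch below is one possible way to do it, under assumptions that are not in the text: the automaton is given as a dictionary mapping each state to a dictionary `{symbol: next_state}`, every state is final, and the function returns `None` as soon as a loop is reachable from the start state.

```python
def count_paths(transitions, start):
    """Number of paths starting at `start`, or None if a loop is reachable.

    transitions: dict state -> dict symbol -> state (all states are final,
    so every path, including the empty one, denotes one normal form).
    """
    finished = {}          # state -> number of paths starting there
    on_stack = set()       # states of the path currently being explored

    def visit(q):
        if q in on_stack:
            return None    # back edge: the automaton has a loop
        if q in finished:
            return finished[q]
        on_stack.add(q)
        total = 1          # the empty path ending in q
        for nxt in transitions.get(q, {}).values():
            sub = visit(nxt)
            if sub is None:
                return None
            total += sub
        on_stack.remove(q)
        finished[q] = total
        return total

    return visit(start)

if __name__ == "__main__":
    # Hypothetical loop-free automaton with six paths from the state "1".
    finite = {"1": {"S": "S", "T": "T"}, "S": {"T": "TS"},
              "T": {"S": "ST", "T": "TT"}}
    # Hypothetical automaton with a loop labelled by the word "a".
    looping = {"1": {"a": "A"}, "A": {"a": "A"}}
    print(count_paths(finite, "1"), count_paths(looping, "1"))
```

Memoising the count per state keeps the traversal linear in the size of the automaton even when the number of paths is large.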

A standard method to build a normal forms automaton for $R$ is to build one 
automaton for each left-hand side of the rules in $R$, to compute the union of these 
automata, to determinize this automaton and then to compute its complement. 
This clearly leads to a normal forms automaton. The main problem of such an 
algorithm is the use of very costly operations on automata: union, determinization 
and complementation. This algorithm is therefore intractable with large 
automata. We address this problem with a new direct algorithm to compute the 
normal forms automaton. This algorithm avoids the use of union, determinization 
and complementation on automata. In addition it builds an automaton with 
labeled states. The labels are terms which are interpreted as formulae describing 
the set of elements of the group recognized by this state. We present in this 
paper a very specialized version of the algorithm described in [12]. The latter 
applies to general signatures with arbitrary arities, AC symbols and sets of 
left-linear rules. Note that thanks to the translation of section 2 we translate a 
non-left-linear rewrite system into a left-linear one. Therefore we avoid the use of the 
conditional automata techniques [1]. 

Definition 4 (Bottom-Up Tree Automaton). A bottom-up tree automaton 
is a quadruple $(\mathcal{F}, Q, Q_f, \mathcal{R})$ where 

— $\mathcal{F}$ is a finite alphabet of function symbols equipped with an arity, 

— $Q$ is a set of states, 

— $Q_f \subseteq Q$ is a set of final states, 

— $\mathcal{R}$ is a rewriting system on $\mathcal{F} \cup Q$, where the symbols in $Q$ are unary and 
where the rules are of the form $f(q_1(x_1), \ldots, q_n(x_n)) \to q(f(x_1, \ldots, x_n))$ 
where $f \in \mathcal{F}$ has arity $n$, $q_1, \ldots, q_n, q$ are in $Q$ and $x_1, \ldots, x_n$ are distinct 
variables. Such a rule will be denoted by $(q_1, \ldots, q_n) \xrightarrow{f} q$ and is called a 
transition. 



Definition 5 (Run of a Bottom-Up Tree Automaton). A run of an automaton 
$(\mathcal{F}, Q, Q_f, \mathcal{R})$ on a ground term $t \in T(\mathcal{F})$ is a rewriting sequence of $\mathcal{R}$ 
from $t$ to $q(t)$ where $q \in Q$. If such a run exists and $q \in Q_f$ then $t$ is accepted 
by the automaton, else $t$ is rejected. 

The bottom-up tree automata we actually use are very particular ones, since 
the function symbols are of arity $\le 1$. Therefore, they are very similar to standard 
finite state automata, except that terms are read bottom-up, that is from right 
to left, when they are considered as strings. 

Example 1. Consider the automaton given in section 1. A run of this automaton 
on the term $\overline{T}(\overline{T}(\overline{S}(1)))$ ends up in the final state $T^2S$. Therefore this term 
is accepted by the automaton. The term $\overline{T}^5(1)$ is rejected because $\overline{T}^4(1)$ is 
accepted by the state $T^4$ and no transition leaves $T^4$ with the symbol $T$. 

From now on the states will be uniquely labeled with terms. Therefore we 
will speak of the state-term t instead of the state labeled with the term t. We 
will use the usual vocabulary for terms and states on state-terms. 

Definition 6 (Minimal Generalization of a Term). Let $t$ be a term and let 
$T$ be a set of terms. A minimal generalization of $t$ in $T$ is a term $u$ such that 

— $u \in T$, 

— $t$ is an instance of $u$, 

— if $v \in T$ and $t$ is an instance of $v$ then $u$ is an instance of $v$. 

Lemma 3. In the case of a signature where all symbols have an arity equal to 0 
or 1, a minimal generalization of a term in a set is either unique up to renaming 
or does not exist. 

In the following, we denote by $R$ a set of rules defined over the algebra 
$T(\overline{\mathcal{F}}, X)$. 

Theorem 2. The algorithm presented in figure 2 computes a normal forms 
automaton for the rewriting system $R$. A state-term $t$ of this automaton recognizes 
exactly the set $T$ of terms in ground normal form such that for all $u \in T$, $t$ is 
the minimal generalization of $u$ in the set of all state-terms. 



We split the proof into several lemmas. 

— Construction: 

• States: 

* Start with an automaton with one final state-term 1 

* For each integer $0 < i \le n$ add a final state-term $\overline{L}_i(x)$ 

* For every strict non-variable subterm $t$ of the left-hand side of a rule 
in $R$ add a final state-term $t$ 

• Transitions: 

* Add the transition $\xrightarrow{1} 1$ 

* For every state-term $t$ and every symbol $\overline{L}_i$, if $\overline{L}_i(t)$ is not reducible 
by $R$ then add a transition $t \xrightarrow{L_i} s$ where $s$ is the minimal generalization 
of $\overline{L}_i(t)$ in the set of all state-terms 

— Cleaning: Clean the automaton, removing the inaccessible states. 

Fig. 2. Algorithm to compute the normal forms automaton 
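
The construction of figure 2 is short enough to be prototyped directly. The following is a minimal sketch and not the implementation mentioned in the conclusion: unary terms are written as strings (`"STx"` for S(T(x)), `"ST1"` for S(T(1))), so that the subterms of a term are the suffixes of its string, and rules are given by their left-hand sides only. The example rule set is a hand-written length-reducing system for the symmetric group on three letters, chosen only for illustration; it does not come from the text.

```python
def is_instance(t, u):
    """True if the term t is an instance of the term u (both as strings)."""
    if u.endswith("x"):                 # u = C[x]: t must start with the context C
        return t.startswith(u[:-1])
    return t == u                       # u is ground

def reducible(t, lhss):
    """True if some subterm of t (a suffix of the string) matches a left-hand side."""
    return any(is_instance(t[k:], l) for k in range(len(t)) for l in lhss)

def minimal_generalization(t, terms):
    """The generalization of t in `terms` that is an instance of all the others."""
    candidates = [u for u in terms if is_instance(t, u)]
    for u in candidates:
        if all(is_instance(u, v) for v in candidates):
            return u
    return None

def build_automaton(symbols, lhss):
    """Return (states, transitions); every state is final, the start state is "1"."""
    states = {"1"} | {f + "x" for f in symbols}
    for l in lhss:                      # strict non-variable subterms of l
        states |= {l[k:] for k in range(1, len(l)) if l[k:] != "x"}
    trans = {}
    for t in states:
        for f in symbols:
            if not reducible(f + t, lhss):
                trans[(t, f)] = minimal_generalization(f + t, states)
    reachable, todo = {"1"}, ["1"]      # cleaning: keep accessible states only
    while todo:
        q = todo.pop()
        for f in symbols:
            s = trans.get((q, f))
            if s is not None and s not in reachable:
                reachable.add(s)
                todo.append(s)
    return reachable, {k: v for k, v in trans.items() if k[0] in reachable}

def accepts(trans, word):
    """Run the automaton on the ground term word(1), read from right to left."""
    state = "1"
    for f in reversed(word):
        state = trans.get((state, f))
        if state is None:
            return False
    return True

# Illustrative system (an assumption): S^2 = T^3 = (ST)^2 = 1, six normal forms.
S3_LHS = ["SSx", "TTTx", "STSx", "STTx", "TSTx", "TTSx"]

if __name__ == "__main__":
    states, trans = build_automaton("ST", S3_LHS)
    print(sorted(states))
    print(accepts(trans, "TS"), accepts(trans, "STS"))
```

Because all symbols are unary, "is an instance of" reduces to a prefix test on strings, which is what makes the minimal generalization unique when it exists (lemma 3).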



Lemma 4. If a term $u$ is accepted by the automaton in a state-term $s$ then $u$ 
is an instance of $s$. 

Proof. The proof is by induction on the size of the recognized terms. If $u = 1$, 
the only accepting state for $u$ is 1. If $u = \overline{L}_i(v)$ is a term accepted by a state-term 
$s$, then let us consider the last transition of an accepting run; it is necessarily of 
the form $t \xrightarrow{L_i} s$. By definition of a run, $v$ is accepted by the state-term $t$, and 
by induction hypothesis $v$ is an instance of $t$, hence $u = \overline{L}_i(v)$ is an instance 
of $\overline{L}_i(t)$. Moreover, by construction of the automaton, $\overline{L}_i(t)$ is an instance of $s$. 
Finally, we get that $u$ is an instance of $s$. 

Lemma 5. If a term is in normal form then it is accepted by the automaton. 

Proof. The proof is by induction on the size of the normal term $u$. If $u = 1$ then 
it is accepted by the state-term 1. Let $u = \overline{L}_i(v)$. Since $u$ is normal, $v$ is normal. 
By the induction hypothesis $v$ is accepted by a state $t$ and the preceding lemma 
proves that $v$ is an instance of $t$. Since $u = \overline{L}_i(v)$ is an instance of $\overline{L}_i(t)$, $\overline{L}_i(t)$ is 
in normal form. By construction there is a transition $t \xrightarrow{L_i} s$ where $s$ is a final 
state-term. Thus $u = \overline{L}_i(v)$ is accepted by the automaton in the state $s$. 



Lemma 6. If $u$ is a normal form accepted by the state-term $s$ then $s$ is the 
minimal generalization of $u$. 

Proof. The proof is by induction on the size of $u$. 

— Case $u = 1$: the only state accepting $u$ is 1. It is the minimal generalization 
state-term of the term 1 in the set of state-terms. 

— Case $u = \overline{L}_i(v)$: let $t \xrightarrow{L_i} s$ be the last transition of an accepting sequence of $u$ 
in the state-term $s$. Clearly $v$ is accepted in $t$ and by induction hypothesis, $t$ is 
the minimal generalization of $v$. In particular $v = t\sigma_0$, $u = \overline{L}_i(v) = \overline{L}_i(t)\sigma_0$. 
Let $G_u$ be the set of the generalized state-terms of $u$ and $G_{\overline{L}_i(t)}$ be the one 
of $\overline{L}_i(t)$. The inclusion $\{s\} \subseteq G_{\overline{L}_i(t)} \subseteq G_u$ clearly holds since $u$ is an instance 
of $\overline{L}_i(t)$ which is itself an instance of $s$. Since all symbols have an arity $\le 1$, 
the set $G_u$ is totally ordered with respect to the instantiation ordering. Let 
$g \in G_u$; we shall prove that $g \in G_{\overline{L}_i(t)}$. We consider the three cases 
described in the following scheme: 

g: case 1 
g: case 2 
g: case 3 

• Cases 1 and 2: $g \in G_{\overline{L}_i(t)}$ by definition. 

• Case 3: $g = \overline{L}_i(t)\sigma = \overline{L}_i(t\sigma)$. Let $g' = t\sigma$. Since $t$ is a state-term, it is not 
a variable. Hence $g'$ is not a variable. Therefore $g$ is a state-term which 
is a strict subterm of the left-hand side of one rule. Consequently $g'$ is 
also a strict subterm of the left-hand side of one rule. $g'$ is a generalized 
term of $v$, therefore $g' = t$ (because $t$ is the minimal generalized term). 
We conclude $g \in G_{\overline{L}_i(t)}$. 

Therefore $G_{\overline{L}_i(t)} = G_u$ and they have the same minimal state-term $s$, which 
is the minimal generalized term of $u$. 



Lemma 7. If a term is reducible then it is rejected by the automaton. 

Proof. Let $u$ be a reducible term. Without loss of generality, we may assume that 
$u = \overline{L}_i(u')$ and $u'$ is in normal form (by taking the minimal reducible subterm of 
$u$). $u'$ is accepted by its minimal generalized state-term $t'$. Let $l = \overline{L}_i(t'')$ be the 
left-hand side of a rule reducing $u$. $u'$ is an instance of $t''$ and $t''$ is a state-term: 
by minimality $t'$ is an instance of $t''$. Therefore $\overline{L}_i(t')$ is an instance of $l$, i.e. is 
reducible. The construction of the automaton will not create a transition labeled 
$L_i$ coming out from $t'$. Thus $u$ is rejected. 



Example 2. We detail the computation of the automaton given in section 1 by 
exhibiting the transitions added from two particular state-terms. 

Transitions from the state-term $\overline{T}(x)$. Note that in section 1, the state $\overline{T}(x)$ is 
denoted by $T$ in the automaton. 

— Transition by $S$: the term $\overline{S}(\overline{T}(x))$ is in normal form. The set of generalized 
state-terms of $\overline{S}(\overline{T}(x))$ is $\{\overline{S}(x); \overline{S}(\overline{T}(x))\}$, and the minimal one is $\overline{S}(\overline{T}(x))$. 
Therefore the transition $\overline{T}(x) \xrightarrow{S} \overline{S}(\overline{T}(x))$ is added. 

— Transition by $T$: $\overline{T}(\overline{T}(x))$ is in normal form. The set of generalized 
state-terms of $\overline{T}(\overline{T}(x))$ is $\{\overline{T}(x); \overline{T}^2(x)\}$, and the minimal one is $\overline{T}^2(x)$. Therefore 
the transition $\overline{T}(x) \xrightarrow{T} \overline{T}^2(x)$ is added. 

Transitions from the state-term $\overline{T}^2(\overline{S}(\overline{T}(x)))$. 

— Transition by $S$: the term $\overline{S}(\overline{T}^2(\overline{S}(\overline{T}(x))))$ is reducible (by the 6th rule of 
the rewriting system $R$ given in section 1). Therefore no transition is added. 

— Transition by $T$: the term $\overline{T}^3(\overline{S}(\overline{T}(x)))$ is in normal form. Its set of generalized 
state-terms is $\{\overline{T}(x); \overline{T}^2(x); \overline{T}^3(x); \overline{T}^3(\overline{S}(x))\}$, and the minimal state-term 
is $\overline{T}^3(\overline{S}(x))$. Thus the transition $\overline{T}^2(\overline{S}(\overline{T}(x))) \xrightarrow{T} \overline{T}^3(\overline{S}(x))$ is added. 



Note that the computation of the automaton does not rely on the convergence 
of the rewriting system: this computation is possible for any rewrite system, and 
when the system is not confluent, the number of paths in the automaton yields 
an upper bound on the number of distinct elements in the quotient algebra. 
When there are some loops in the automaton, they can be used to guess the 
kind of equations that have to be added in order to “fold” the algebra, in the 
same way as in the convergent case (see the beginning of the section) . 



4 Conclusion 

We have shown how to apply rewriting and automata techniques to the study 
of quotients of $SL_2(\mathbb{Z})$. The main technical issues are, first, the correspondence 
between a particular class of finitely generated groups and a quotient algebra 
defined over a signature containing only function symbols of arity $\le 1$, and 
second, the direct construction of a normal forms automaton for a convergent 
rewrite system built over such a "unary" signature. 

Concerning the first point, we have defined an isomorphism on the ground 
terms of the quotient algebras, which is compatible with the equational theory. 
This translation is possible not only for the equational theory, but also step by 
step for the completion process (when the strategy synchronizes the processing 
of one equation or rule and its extension, in the binary case). The unary 
presentation makes it possible to dispense with the convergent rewriting system for groups and with 
the extensions of the other rules. If the quotient algebra is finite, the completion 
will always terminate, but in case we suspect the completion to diverge, it would 
be interesting to see whether a completion using SOUR graphs [11] could help, 
giving some hints on what equations can be added to the equational theory in 
order to get a finite group. 

When the completion terminates, and yields a convergent rewrite system, the 
automaton may have loops which represent infinite sets of distinct elements in the 
group. Such a loop can be easily represented as a regular language $\{w^p \mid p \in \mathbb{N}\}$, 
thanks to the pumping lemma, and may help to understand how to fold the group 
into a finite group. 

A more prospective line of research is to try to do parameterized completion for some 
particular sets of equations, where there are some integers, which are actually 
parameters; for example some powers of given words occurring in the equations 
defining a class of groups, such as the above integer $p$. 

We have been able to treat the example of $PSL_2(\mathbb{Z}/5\mathbb{Z})$, which is small, but 
not trivial, and it seems to us that this shows that our method is suitable for this 
kind of study. From a practical point of view, we used CiME [3], in which 
completion (and also completion modulo the group theory) is implemented. CiME 
also has the general version of the direct construction of normal forms automata, 
with loop detection and path counting. This means that we are able to treat 
our motivating example in a fully automatic way. 

Our ultimate goal is to be able to treat some examples with several million 
elements; hence, for the sake of efficiency, the construction of normal forms automata 
in the case when there are only symbols of arity $\le 1$ will be implemented. 

Abstract. We consider equational theories of binary relations, in a 
language expressing composition, converse, and lattice operations. We treat 
the equations valid in the standard model of sets and also define a 
hierarchy of equational axiomatisations stratifying the standard theory. By 
working directly with a presentation of relation-expressions as graphs we 
are able to define a notion of reduction which is confluent and strongly 
normalising, in sharp contrast to traditional treatments based on 
first-order terms. As consequences we obtain unique normal forms and 
decidability of the decision problem for equality for each theory. In particular we 
show a non-deterministic polynomial-time upper bound for the 
complexity of the decision problems. 



1 Introduction 

The theory of binary relations is a fundamental conceptual and methodological 
tool in computer science. The formal study of relations was central to early 
investigations of logic and the foundations of mathematics [11, 20, 24, 25, 26] 
and has more recently found application in program specification and derivation 
[2, 6, 4, 18], denotational and axiomatic semantics of programs [8, 10, 22, 19], 
and hardware design and verification [7, 16]. 

The collection of binary relations on a set has rich algebraic structure: it 
forms a monoid under composition, each relation has a converse, and it forms a 
Boolean algebra under the usual set-theoretic operations. In fact the equational 
theory in this language is undecidable, since it is possible to encode set theory 
[26]. Here we eliminate complementation as an operation, and investigate the set 
$E_{\mathcal{R}}$ of equations between relation-expressions valid when interpreted over sets, 
as well as certain equational axiomatic theories naturally derived from $E_{\mathcal{R}}$. 

Now, the most popular framework for foundations and for implementations 
of theorem provers, proof-checkers, and programming languages remains the 
λ-calculus. It seems reasonable to say that this is due at least in part to the fact 
that the equational theory of λ-terms admits a computational treatment which 
is well-behaved: β-reduction is confluent, and terminating in typed calculi, so 
that the notion of normal form is central to the theory. 

To our knowledge, no analogous notion of normal form for terms in $E_{\mathcal{R}}$ is 
known. In fact the calculus of relations has a reputation for being complex. 
Bertrand Russell (quoted in [21]) viewed the classical results of Peirce and 
Schröder on relational calculus as being "difficult and complicated to so great 
a degree as to doubt their utility." And in their recent monograph [4, page 81] 
Bird and de Moor observe that "the calculus of relations has gained a good deal 
of notoriety for the apparently enormous number of operators and laws one has 
to memorise in order to do proofs effectively." 

But in this paper we suggest that a rather attractive syntactic/computational 
treatment of the theory of relations is indeed available, at least for the fragment 
of the theory not including complementation. 

The essential novelty derives from the idea of taking certain graphs as the 
representation of relations. These graphs, called here “diagrams,” arise very nat- 
urally and have been used since Peirce by researchers in the relation community 
(e.g. Tarski, Lyndon, Jonsson, Maddux, etc.); recent formalisations appear in 
[12, 1, 7]. What we do here is to take graphs seriously as a notation alternative 
to first-order terms, i.e., to treat diagrams as first-class syntactic entities, and 
specifically as candidates for rewriting. 

One can see diagram rewriting as an instance of a standard technique in 
automated deduction. It is well-known that certain equations inhibit classical 
term-rewriting techniques (the typical examples are associativity and 
commutativity) and that a useful response can be to pass to computing modulo these 
equations. In Table 1 we exhibit a set $E_{\mathcal{D}}$ of equations such that diagrams are 
the natural data structure for representing terms modulo $E_{\mathcal{D}}$. 



Summary of Results 

It is not hard to see that in the absence of complementation equality between 
relation-expressions can be reduced to equality between expressions not involving 
union, essentially because union distributes over the other operations. So we 
ultimately restrict attention to the complement- and union-free fragment of the 
full signature (see Definition 1). It is known [1, 12] that the set of equations true 
in set-relation algebras in this signature is decidable. 

We clarify the relationship between terms and diagrams by showing that the 
algebra of diagrams is precisely the free algebra for the set $E_{\mathcal{D}}$ of equations 
between terms. It is rather surprising that a finite set of equations accounts 
for precisely the identifications between terms induced by compiling them into 
diagrams. 

Freyd and Scedrov [12] isolated the theory of allegories, a finitely axioma- 
tisable subtheory of the theory of relations which corresponds to a certain 
geometrically-motivated restricted class of morphisms between diagrams. We re- 
fine this by constructing a proper hierarchy of equational theories, beginning with 
the theory of allegories, which stratifies the equational theory of set-relations. 

Our main result is a computational treatment of diagrams via a notion of 
reduction. Actually each of the equational theories in the hierarchy induces its 
own reduction relation; but we prove uniformly that each reduction satisfies 
strong normalisation and Church-Rosser properties. Therefore each theory enjoys
unique (diagram-) normal forms and decidability. In fact the decision problem
for each theory is in NP, non-deterministic polynomial time. 

We feel that the existence of computable unique normal forms is our most 
striking result. The virtue of treating diagrams as syntax is highlighted by the 
observation that $E_{\mathcal{R}}$ is not finitely axiomatisable [15], so no finite term rewriting
system can even claim to correctly present the theory, much less be a convergent 
presentation. 

In light of the characterisation of the set of diagrams as the free algebra for 
the set $E_{\mathcal{D}}$ of equations, these results can be seen, if one insists, as results
about rewriting of terms modulo $E_{\mathcal{D}}$. But for us the diagram presentation is the
primary one and is ultimately the closest to our intuition. 



Related Work 

The case for using a calculus of relations as a framework for concepts and meth- 
ods in mathematics and computer science is compellingly made by Freyd and 
Scedrov in [12]. They define allegories as certain categories; the structures modeled
in this paper are in that sense one-object allegories. One may view this as
the distinction between typed and untyped calculi. 

Bird and de Moor’s book [4] is an extended presentation of the application 
of relational calculus to program specification and derivation, building explicitly 
on the theory of allegories. There, terms in relation calculus are not programs 
per se, but the authors do raise the question of how one might execute relation- 
expressions [4, page 110]. As noted there, a promising proposal is made by Lipton 
and Chapman in [18], where a notion of rewriting terms using the allegory axioms 
is presented. It should be very interesting to explore the relationship between 
the Lipton-Chapman model and the one presented here. 

Brown and Hutton [7] apply relational methods to the problems of designing 
and verifying hardware circuits. They observe that people naturally reason informally
about pictures of circuits and seek to provide a formal basis, again based
on allegories, for such reasoning; their vehicle is the relational language RUBY 
used to design hardware circuits. To our knowledge they do not claim decidabil- 
ity or normal forms for the theory they implement. An implementation of their 
method is distributed at [16]. 

Two other investigations of graphical relation-calculi are the work of Kahl 
[17] and that of Curtis and Lowe [9]. 

The general topic of diagrammatic reasoning has been attracting interest in 
several areas lately (see for example [3]). The present research might be viewed 
as a case-study in reasoning with diagrams in the general sense. 

Further indication of the range of current investigations into relations and 
relation-calculi may be found in, for example, the books [23] or [5] or the pro- 
ceedings of the roughly annual RelMiCS conferences. 



2 Preliminaries

Definition 1. The signature $\Sigma$ is composed of the binary operations intersection
$\cap$ and composition $;$ (usually written as concatenation), two unary operations
converse $(\,)^\circ$ and domain $\mathrm{dom}$, and a constant $1$.

When we exhibit terms, the composition is to be interpreted in “diagrammatic
order” so that $xy$ means “$x$ first, then $y$.” The operation dom has a natural
interpretation as the domain of a relation, and under this interpretation it is
definable from the other operations as $\mathrm{dom}\,x = 1 \cap x x^\circ$. The inclusion of dom
in the signature is non-traditional, but there is a very good technical reason for 
its inclusion, made clear in the remark following Theorem 2. 

The standard models are sets and binary relations; the following definition 
is from [1]. 

Definition 2. A (subpositive) set relation algebra is an algebra of the form 
$\langle A, \cap, ;, (\,)^\circ, \mathrm{dom}, 1\rangle$ where $A$ is a set of binary relations on some base set closed
under the operations, which have the standard relational meaning (1 being the 
identity relation). 
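
To make the standard models concrete, the following minimal Python sketch represents a binary relation as a set of pairs and checks, on one small example, that the domain operation agrees with its definition from intersection, composition and converse. The function names and the sample relations `x` and `y` are illustrative choices, not taken from the paper.

```python
def compose(r, s):
    """Composition in diagrammatic order: r first, then s."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def converse(r):
    return {(b, a) for (a, b) in r}

def identity(base):
    return {(a, a) for a in base}

def dom(r):
    """Domain of r, represented as a subset of the identity relation."""
    return {(a, a) for (a, _) in r}

base = {0, 1, 2}
x = {(0, 1), (1, 1), (1, 2)}
y = {(1, 0), (2, 2)}

# dom x computed directly, and via the definition 1 ∩ x x°.
dom_direct = dom(x)
dom_defined = identity(base) & compose(x, converse(x))

# Converse reverses composition: (xy)° = y° x°.
conv_lhs = converse(compose(x, y))
conv_rhs = compose(converse(y), converse(x))
```

A single example of course proves nothing about validity in all set relation algebras; the sketch only fixes the reading of the operations, in particular the diagrammatic order of composition.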

By $\mathcal{R}$ we will denote the class of algebras isomorphic to set relation algebras
and by $E_{\mathcal{R}}$ the set of equations valid in $\mathcal{R}$.

Definition 3. An undirected graph $g$ is a pair $(V_g, E_g)$ of sets (vertices and
edges) together with a map $E_g \to [V_g]^2$, where the elements of $[V_g]^2$ are the
2-element multisets from $V_g$. Such a graph is connected if there is a path between
any two vertices. An undirected graph $h$ is a minor of $g$ if $h$ can be obtained from
a subgraph $g'$ of $g$ by a sequence of contractions of vertices of $g'$.

The notion of directed graph is obtained by replacing $[V_g]^2$ by $V_g \times V_g$ in the
definition above; note that any directed graph $g$ obviously has an undirected graph
underlying it. A directed graph $g$ is labelled by a set $X$ if there is a function
$l_g : E_g \to X$. We will be interested in this paper in directed labelled graphs
$g$ with a distinguished start vertex $s_g$ and a distinguished finish vertex $f_g$. We
allow these to be the same vertex.

For the sake of brevity the term graph will always mean: directed, labelled 
graph with distinguished start and finish vertices, whose underlying undirected 
graph is connected. 

Let $\mathcal{G}$ denote the set of such graphs. Strictly speaking the set $\mathcal{G}$ depends on
the particular set of labels chosen, but this set will never change in the course of 
our work, so we suppress mention of the label-set in the notation. We do assume 
that the set of labels is infinite. 

A morphism $\varphi$ between graphs $g$ and $h$ is a pair of functions $\varphi_V : V_g \to V_h$
and $\varphi_E : E_g \to E_h$ which

— preserves edges and direction, i.e., for all $v, w \in V_g$, if $e$ is an edge in $g$
between $v$ and $w$, then $\varphi_E(e)$ is an edge in $h$ between $\varphi_V(v)$ and $\varphi_V(w)$,

— preserves labels, i.e., for all $e \in E_g$, $l(e) = l(\varphi_E(e))$, and

— preserves start and finish vertices, i.e., $\varphi_V(s_g) = s_h$ and $\varphi_V(f_g) = f_h$.

If it is clear from context, we will simply write $\varphi$ instead of $\varphi_V$ or $\varphi_E$.

Fig. 1. The distinguished graphs $1$, $2_a$, and $2_{a^{-1}}$

Fig. 2. Operations on graphs



2.1 Diagrams for Binary Relations 

Here we introduce the central notion of diagram for a relational term. The set of 
diagrams supports an algebraic structure reflecting that of relations themselves, 
and a categorical structure in which morphisms between diagrams correspond 
to equations between relational terms valid in all set relation algebras. This 
material is standard and provides a foundation for our work in the rest of the 
paper. 

There are some distinguished graphs in $\mathcal{G}$. The graph with only one vertex
which is at the same time the start and finish, and no edges, will be denoted by
$1$. The graph with edge labelled $a$ from the start vertex to the (distinct) finish
vertex is denoted $2_a$; the graph obtained by reversing the sense of the edge is
$2_{a^{-1}}$. (See Figure 1.)

Definition 4. Let $g, g_1, g_2$ be graphs in $\mathcal{G}$. We define the following operations
in $\mathcal{G}$ (see Figure 2 for a graphical presentation).

1. The parallel composition, $g_1 \parallel g_2$, is the graph obtained by (1) identifying the
start vertices of the graphs $g_1, g_2$ (this is the new start), and (2) identifying
the finish vertices of the graphs $g_1, g_2$ (the new finish).

2. The sequential composition, $g_1 \mid g_2$, is the graph obtained by identifying the
finish of $g_1$ with the start of $g_2$, and defining the new start to be $s_{g_1}$ and the
new finish to be $f_{g_2}$.

3. The converse of $g$, denoted by $g^{-1}$, is obtained from $g$ by interchanging its
start and finish. It is important to note that neither labels nor direction of
edges changes.

4. The branching of $g$, denoted by $\mathrm{br}(g)$, is the graph obtained from $g$ by
redefining its finish to be the same as the start.

Fig. 3. Some graphs not in $\mathcal{D}$: the diamond and the diamond ring. Note the vertices $s$, $f$ in each case.

Not every graph in $\mathcal{G}$ can be built using these operations. Figure 3 gives two
examples (the edges in these pictures can be directed at will) . Further significance 
of these graphs is given in Theorem 5. 

Definition 5. Let $\mathcal{D}$ denote the set of graphs generated by $1$, the $2_a$, and
the operations of branching and sequential and parallel composition.

The set $\mathcal{D}$ is a $\Sigma$-algebra in a natural way; Theorem 2 below says more. $\mathcal{D}$
will play a key role in the normalisation process and has interesting properties
in its own right.

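Let $T_\Sigma(X)$ be the set of first-order terms over $\Sigma$ with the labels $X$ as
variables. Then there is a surjective homomorphism

$T_\Sigma(X) \to \mathcal{D}, \quad t \mapsto g_t$

defined recursively by $g_1 = 1$, $g_a = 2_a$ for $a \in X$, $g_{t_1 t_2} = g_{t_1} \mid g_{t_2}$,
$g_{t_1 \cap t_2} = g_{t_1} \parallel g_{t_2}$, $g_{t^\circ} = (g_t)^{-1}$ and $g_{\mathrm{dom}\,t} = \mathrm{br}(g_t)$.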
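
The compilation of terms into diagrams can be sketched in Python as follows. This is only an illustration under assumptions of our own: terms are encoded as nested tuples with the hypothetical operator tags `"seq"`, `"cap"`, `"conv"` and `"dom"`, vertices are integers drawn from a counter, and `describe` is a crude summary that is adequate for the two-vertex example below but is not a general isomorphism test.

```python
from itertools import count

_fresh = count()  # supply of unused vertex names

class Graph:
    def __init__(self, vertices, edges, start, finish):
        self.vertices = set(vertices)
        self.edges = list(edges)  # triples (source, label, target)
        self.start, self.finish = start, finish

def _copy(g):
    """Rename every vertex of g to a fresh one."""
    m = {v: next(_fresh) for v in g.vertices}
    return Graph(m.values(), [(m[u], a, m[v]) for u, a, v in g.edges],
                 m[g.start], m[g.finish])

def _glue(g1, g2, start, finish, pairs):
    """Disjoint union of g1 and g2 with the listed vertex pairs identified."""
    parent = {v: v for v in g1.vertices | g2.vertices}
    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v
    for p, q in pairs:
        parent[find(p)] = find(q)
    return Graph({find(v) for v in parent},
                 [(find(u), a, find(v)) for u, a, v in g1.edges + g2.edges],
                 find(start), find(finish))

def par(g1, g2):   # parallel composition: share start and finish
    g1, g2 = _copy(g1), _copy(g2)
    return _glue(g1, g2, g1.start, g1.finish,
                 [(g1.start, g2.start), (g1.finish, g2.finish)])

def seq(g1, g2):   # sequential composition: finish of g1 meets start of g2
    g1, g2 = _copy(g1), _copy(g2)
    return _glue(g1, g2, g1.start, g2.finish, [(g1.finish, g2.start)])

def conv(g):       # converse: swap start and finish, edges untouched
    return Graph(g.vertices, g.edges, g.finish, g.start)

def br(g):         # branching: finish becomes the start
    return Graph(g.vertices, g.edges, g.start, g.start)

def compile_term(t):
    """Terms: "1", a label string, or a tuple (op, subterm, ...)."""
    if t == "1":
        v = next(_fresh)
        return Graph({v}, [], v, v)
    if isinstance(t, str):
        s, f = next(_fresh), next(_fresh)
        return Graph({s, f}, [(s, t, f)], s, f)
    op, *args = t
    build = {"seq": seq, "cap": par, "conv": conv, "dom": br}[op]
    return build(*(compile_term(a) for a in args))

def describe(g):
    """Vertex count, edges described relative to the start, and start == finish."""
    return (len(g.vertices),
            sorted((a, u == g.start, v == g.start) for u, a, v in g.edges),
            g.start == g.finish)

# 1 ∩ x(y ∩ z)  versus  1 ∩ (x ∩ y°)z
lhs = compile_term(("cap", "1", ("seq", "x", ("cap", "y", "z"))))
rhs = compile_term(("cap", "1", ("seq", ("cap", "x", ("conv", "y")), "z")))
```

Both terms compile to the same picture: two vertices, an edge labelled x leaving the start, and edges labelled y and z returning to it, with start and finish identified.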

We can see the power of diagrams in the following important representation 
theorem. Recall that $\mathcal{R}$ denotes the class of algebras isomorphic to subpositive
set relation algebras. 

Theorem 1 (Freyd-Scedrov, Andreka-Bredikhin). Let $r$, $t$ be terms in
the signature $\Sigma$. Then the equation $r = t$ is valid in $\mathcal{R}$ if and only if there are
morphisms $g_r \to g_t$ and $g_t \to g_r$.

Proof. The relationship between graphs in $\mathcal{D}$ and set relation algebras goes as
follows. For an algebra $\mathcal{A}$ with base $A$ and relations $R_1, \ldots, R_n$ of $\mathcal{A}$ define the
graph $g_{\mathcal{A},\bar{R}}$ to have the set of vertices $A$ and an edge $(a, b)$ with label $j$ for each
$(a, b) \in R_j$. Observe that $g_{\mathcal{A},\bar{R}}$ is not necessarily in $\mathcal{D}$. By an induction on terms
it can be proved that for each term $t$ in $T_\Sigma(R_1, \ldots, R_n)$ and elements $a, b \in A$
it holds $(a, b) \in t^{\mathcal{A}}[\bar{R}]$ if and only if there is a $\mathcal{G}$-morphism $g_t \to g_{\mathcal{A},\bar{R}}$ which
takes $s$ to $a$ and $f$ to $b$.

The statement of the theorem now follows easily.



/// 




Normal Forms and Reduction for Theories of Binary Relations 






The strength of Theorem 1 is to reduce equational reasoning in binary rela- 
tions to reasoning about graph theoretical morphisms. In particular since dia- 
grams are finite one can check whether or not there are morphisms between two 
given ones, so we have a decision procedure for equality (in $E_{\mathcal{R}}$).
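
The decision procedure suggested by Theorem 1 can be sketched as a brute-force search over vertex maps. The encoding below (a graph as a tuple of vertex set, edge list, start and finish) and the names `has_morphism` and `equivalent` are assumptions made for illustration. Since an edge may be sent to any edge with the right endpoints and label, a morphism exists as soon as a suitable vertex map does.

```python
from itertools import product

def has_morphism(g, h):
    """Search for a graph morphism g -> h by trying every vertex map."""
    g_vertices, g_edges, g_start, g_finish = g
    h_vertices, h_edges, h_start, h_finish = h
    if g_start == g_finish and h_start != h_finish:
        return False  # one vertex cannot be sent to two different ones
    targets = set(h_edges)
    free = [v for v in g_vertices if v not in (g_start, g_finish)]
    for choice in product(sorted(h_vertices), repeat=len(free)):
        phi = dict(zip(free, choice))
        phi[g_start], phi[g_finish] = h_start, h_finish
        # Every labelled edge of g must land on a labelled edge of h.
        if all((phi[u], a, phi[v]) in targets for u, a, v in g_edges):
            return True
    return False

def equivalent(g, h):
    """Morphisms in both directions, as required by Theorem 1."""
    return has_morphism(g, h) and has_morphism(h, g)

# Diagram of x(y ∩ z)
left = ({"s", "v", "f"},
        [("s", "x", "v"), ("v", "y", "f"), ("v", "z", "f")], "s", "f")
# Diagram of x(y ∩ z) ∩ xy
right = ({"s", "v", "w", "f"},
         [("s", "x", "v"), ("v", "y", "f"), ("v", "z", "f"),
          ("s", "x", "w"), ("w", "y", "f")], "s", "f")
# Diagram of xy
xy = ({"s", "v", "f"}, [("s", "x", "v"), ("v", "y", "f")], "s", "f")
```

Here `equivalent(left, right)` holds, while for `xy` and `left` only one of the two morphisms exists, so the corresponding equation is not valid. The search is exponential in the number of vertices; it is meant to show the idea, not to be efficient.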

This result can be improved in at least two directions from a computational 
point of view: 

— Refine the morphisms in order to stratify the equations, hence possibly get- 
ting better computational tools for interesting fragments. 

— Investigate rewrite systems and normal forms in this new representation. 

We pursue these directions in the following two sections. In fact the devel- 
opments are independent of one another, so that the reader interested primarily 
in diagram-rewriting can on a first reading proceed directly to Section 4. 

3 Terms and Equations as Diagrams and Morphisms 

Theorem 1 shows that morphisms between diagrams reflect equations valid in 
the theory of binary relations. Unfortunately it puts all these valid equations 
in one sack. Experience shows that certain equations appear more often than 
others in practice and are in some sense more fundamental. Our program in
this section is to classify equations by their operational meaning. 

3.1 Equational Characterisation of $\mathcal{D}$



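\[
\begin{array}{rcl@{\qquad\qquad}rcl}
x1 &=& x & 1^\circ &=& 1 \\
x(yz) &=& (xy)z & 1 \cap 1 &=& 1 \\
x \cap y &=& y \cap x & (1 \cap x)(1 \cap y) &=& 1 \cap x \cap y \\
x \cap (y \cap z) &=& (x \cap y) \cap z & x \cap y(1 \cap z) &=& (x \cap y)(1 \cap z) \\
x^{\circ\circ} &=& x & 1 \cap x(y \cap z) &=& 1 \cap (x \cap y^\circ)z \\
(xy)^\circ &=& y^\circ x^\circ & \mathrm{dom}\,1 &=& 1 \\
(x \cap y)^\circ &=& x^\circ \cap y^\circ & (\mathrm{dom}\,x)^\circ &=& \mathrm{dom}\,x \\
 & & & \mathrm{dom}((x \cap y)z) &=& 1 \cap x(\mathrm{dom}\,z)y^\circ \\
 & & & \mathrm{dom}((\mathrm{dom}\,x)y) &=& (\mathrm{dom}\,x) \cap (\mathrm{dom}\,y)
\end{array}
\]

Table 1. The equations $E_{\mathcal{D}}$.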



To start with, the class $\mathcal{D}$ itself embodies certain equations in the sense that
each graph in $\mathcal{D}$ can come from several different terms. It is interesting that
these identifications can be axiomatised by a finite set of equations, $E_{\mathcal{D}}$, shown
in Table 1. The equations in the left column capture the essential properties of
the operators (associativity, commutativity, and the involutive laws for converse),
those in the upper right hand deal with identifications among terms of the form
$1 \cap x$, and the rest take care of the identification of terms which contain the
operator dom. All of these equations are trivially valid in $\mathcal{D}$: their left- and
right-hand sides compile to the same diagram.

Fig. 4. Graphs representing some terms in the equations in Table 1. The ones on
the left correspond to the last three equations about $1 \cap x$. The ones on the right
to the last two equations about dom. Observe that in each of these equations
the left- and right-hand side terms are represented by the same graph.

Theorem 2. $\mathcal{D}$ is the free algebra over the set of labels for the set of equations
$E_{\mathcal{D}}$.

Proof. For the non-trivial direction we have to show that if $r$, $t$ are not provably
equal under $E_{\mathcal{D}}$, then $g_r \neq g_t$. This is done by proving that $E_{\mathcal{D}}$ can be completed
by a finite number of equations into a confluent and terminating rewrite system
modulo the equations for associativity of composition, AC of intersection and
$1 \cap x(y \cap z) = 1 \cap (x \cap y^\circ)z$. A complete proof is in [14]. ///



3.2 Equations Capturing Morphisms 

Freyd and Scedrov in [12] made the observation (without proof) that $\mathcal{D}$-morphisms
which collapse at most two vertices at a time correspond to a simple
and natural equational theory, an abstract theory of relations, the theory of
allegories. Motivated by this idea we introduce a proper hierarchy of equational
theories, stratifying $E_{\mathcal{R}}$ in terms of complexity of the morphisms acting on the
data, each of which has a geometric as well as algebraic aspect.



Definition 6. Let $\varphi : g_1 \to g_2$ be an arrow in $\mathcal{D}$. We call $\varphi$ an $n$-arrow if

$|V_{g_1}| \le |V_{\varphi(g_1)}| + n$.

\[
\begin{array}{rcl}
x \cap x &=& x \\
x(y \cap z) &=& x(y \cap z) \cap xy \\
xy \cap z &=& (x \cap zy^\circ)y \cap z
\end{array}
\]

Table 2. The operational equations




Fig. 5. The graph representations of the equations in Table 2. 
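
As a small illustration, read Definition 6 as saying that an n-arrow loses at most n vertices when passing from its source to its image. Under that reading, the least such n for a given vertex map can be computed directly. The helper name `arrow_rank` and the vertex names are hypothetical; the example is the morphism from the diagram of x(y ∩ z) ∩ xy onto the diagram of x(y ∩ z) behind the second equation of Table 2.

```python
def arrow_rank(vertices, phi):
    """Number of vertices lost under phi: the least n for which phi
    is an n-arrow, under the reading of Definition 6 stated above."""
    return len(vertices) - len({phi[v] for v in vertices})

# Diagram of x(y ∩ z) ∩ xy has vertices s, v, w, f.
vertices = {"s", "v", "w", "f"}
# The morphism onto the diagram of x(y ∩ z) sends w to v and fixes the rest.
collapse = {"s": "s", "v": "v", "w": "v", "f": "f"}
identity = {v: v for v in vertices}

rank_collapse = arrow_rank(vertices, collapse)
rank_identity = arrow_rank(vertices, identity)
```

The collapsing map identifies exactly two vertices, so it has rank 1, while the identity has rank 0.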



Note that in general the composition of n-arrows is not an n-arrow. This 
motivates the following definition. 

Definition 7. Let n > 0 be a natural number. 

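1. The category $\mathcal{D}^n$ is the category whose objects are the graphs in $\mathcal{D}$, and whose
arrows are finite compositions of $n$-arrows in $\mathcal{D}$.

2. The theory $E^n_{\mathcal{R}}$ is the set of equations $r = t$ between $\Sigma$-terms such that
$g_r \to g_t$ and $g_t \to g_r$ in $\mathcal{D}^n$.

We have the following chain of inclusions of categories: $\mathcal{D}^1 \subseteq \mathcal{D}^2 \subseteq \cdots \subseteq \mathcal{D}$.
It can be shown that if $n > 1$ then $E^n_{\mathcal{R}}$ is closed under deduction. So we have a
hierarchy of equational theories $E^1_{\mathcal{R}} \subseteq E^2_{\mathcal{R}} \subseteq \cdots \subseteq \bigcup_n E^n_{\mathcal{R}}$.

Theorem 1 can now be rephrased as $E_{\mathcal{R}} = \bigcup_n E^n_{\mathcal{R}}$.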

The equational theory of allegories is presented by the axioms in the left
column of Table 1 plus the ones shown in Table 2. These last three equations
correspond (in the sense of Theorem 1) to 1-arrows over graphs in $\mathcal{D}$, as can
be checked from the graphical representation in Figure 5. The next theorem
formalises the converse statement, i.e., that they are sufficient to axiomatise the
equations obtained from 1-arrows.

Theorem 3. $E^1_{\mathcal{R}}$ is exactly the equational theory of allegories.



Proof. The proof is delicate and we do not present it here; a complete proof can 
be found in [14]. ///

Theorem 4. For each $n > 2$ it holds $E^n_{\mathcal{R}} \subsetneq E_{\mathcal{R}}$ (the inclusion is proper).

Proof. An adaptation of the technique in [12, 2.158] showing that $E_{\mathcal{R}}$ is not
finitely axiomatisable. ///

Unfortunately we do not yet know much more about each of the steps
in the hierarchy. But we conjecture that E)^ = E)^ = E)^. We also conjecture
that for every $n > 2$, the equational theory $E^n_{\mathcal{R}}$ is finitely axiomatisable. We do
not know the answer to the question: “is E^f- = for n > 1?”

4 Normalisation 

This section presents a very general combinatorial lemma concerning the set 
of functions over finite structures viewed as an abstract reduction system. It is 
most conveniently presented using the language of categories, but no more than 
rudimentary category theory is required for the presentation. The material in 
this section is condensed from [13]. 

In this section juxtaposition denotes composition of arrows in a category, and 
is to be read in the standard way, so that $fg$ means “$g$ first.” We use $A \cong B$ to
indicate that $A$ and $B$ are isomorphic.

Definition 8. Let $C$ be any category, and $A$, $B$ objects of $C$. Define the relation
$\rightleftarrows$ between objects of $C$ as follows.

$A \rightleftarrows B$ if and only if there are arrows $A \to B$ and $B \to A$.

Clearly $\rightleftarrows$ is an equivalence relation. Our goal is to find simple conditions
which make the relationship $\rightleftarrows$ decidable.

The following notion is motivated by the observation that a (set-theoretic) 
function $f$ between sets $A$ and $B$ can be seen as a map onto its image $f(A)$
followed by the inclusion of $f(A)$ into $B$.

Definition 9. An arrow m is mono if whenever ma = mb then a = b. An arrow 
e is epi if whenever ae = be then a = b. 

An arrow $f : A \to B$ has an epi-mono factorisation if there exist arrows $e$ epi
and $m$ mono such that $f = me$. A category $C$ has epi-mono factorisation if every
arrow in $C$ has such a factorisation.



Definition 10. An object $A$ is hom-finite if the set $\mathrm{Hom}(A, A)$ of maps from
$A$ to $A$ is finite. A category $C$ is hom-finite if each object of $C$ is hom-finite.

Of course any concrete category of sets of finite objects will be hom-finite.
In particular, the categories $\mathcal{G}$, $\mathcal{D}$ and $\mathcal{D}^n$ for each $n$ are each hom-finite.

Lemma 1. Suppose that $A$ is hom-finite. If $m : A \to A$ is a monomorphism,
then $m$ is an isomorphism. Also, if there are monomorphisms $m_1 : A \to B$ and
$m_2 : B \to A$, then $A$ and $B$ are isomorphic.

Proof. For the first assertion: consider the monomorphisms $m, m^2, \ldots : A \to A$.
Because $A$ is hom-finite, there are integers $i < j$ such that $m^i = m^j$. Now, using
the fact that $m$ is mono, $1_A = m^k$ for $k = j - i$. Thus $m$ is a monomorphism
with a right inverse; this implies that $m$ is an isomorphism.

For the second claim: we have that $m_2 m_1 : A \to A$ is mono, hence, by the
first part it is an isomorphism. Hence there exists $f$ with $m_2 m_1 f = 1_A$. Thus $m_2$
is a monomorphism with a right inverse; this implies that it is an isomorphism.

/// 
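
The first part of the argument can be replayed in the category of finite sets, where a monomorphism from a set to itself is an injective map. The sketch below, with illustrative names `compose`, `power` and `collision_gap`, lists the powers of such a map until two of them coincide and returns the difference k of the exponents; the k-th power is then the identity.

```python
def compose(f, g):
    """The composite fg, read in the standard way: g first, then f."""
    return {x: f[g[x]] for x in g}

def power(m, k):
    """The k-fold composite of m with itself (the identity for k = 0)."""
    result = {x: x for x in m}
    for _ in range(k):
        result = compose(m, result)
    return result

def collision_gap(m):
    """Find i < j with m^i = m^j among m, m^2, ... and return k = j - i."""
    seen = {}
    current, exponent = dict(m), 1
    while True:
        key = tuple(sorted(current.items()))
        if key in seen:
            return exponent - seen[key]
        seen[key] = exponent
        current, exponent = compose(m, current), exponent + 1

# An injective self-map of {0, 1, 2, 3}.
m = {0: 1, 1: 2, 2: 0, 3: 3}
k = collision_gap(m)
identity = {x: x for x in m}
```

For this `m` the first repetition is between the first and fourth powers, so k is 3, and `power(m, k - 1)` is a right inverse of `m`.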

Definition 11. Let $A$, $B$ be objects in a category $C$. Define $A \Rightarrow B$ if and only
if $B$ is both (the target of) a quotient- and (the source of) a sub-object of $A$; that
is, there is an epimorphism $e$ and a monomorphism $m$ such that

$A \xrightarrow{e} B \xrightarrow{m} A$.

We also require that $e$ not be an isomorphism.

By $\Rightarrow^*$ we will denote the reflexive-transitive closure of $\Rightarrow$, where reflexivity
is defined up to isomorphism. Thus $A \Rightarrow^* B$ means either $A \cong B$ or there is a
finite sequence $A \Rightarrow C_1 \Rightarrow \cdots \Rightarrow C_n \Rightarrow B$.



Lemma 2. If $A$ is hom-finite then there are no infinite $\Rightarrow$-reductions out of $A$.

Proof. For sake of contradiction suppose there were such a sequence
$A = A_0 \Rightarrow A_1 \Rightarrow A_2 \Rightarrow \cdots$. For each $i$
we have maps $A_{i-1} \xrightarrow{e_i} A_i \xrightarrow{m_i} A_{i-1}$, and so we may define
$\alpha_i = m_1 \cdots m_i e_i \cdots e_1 : A \to A$.
Since $A$ is hom-finite there are $i < j$ with $\alpha_i = \alpha_j$. Cancelling the monos
on the left and the epis on the right, we have $1_{A_i} = m_{i+1} \cdots m_j e_j \cdots e_{i+1}$. This
implies that $e_{i+1}$ is iso, a contradiction. ///

Observe that $A \Rightarrow^* B$ implies that $A \rightleftarrows B$. The converse need not be true
in general, but the next result provides a strong converse in certain categories.

Proposition 1. Suppose $C$ is a hom-finite category with epi-mono factorisation.
Then if $A \rightleftarrows B$, there exists $C$ such that $A \Rightarrow^* C \mathrel{{}^*{\Leftarrow}} B$.

Proof. The proof is by Noetherian induction over $\Rightarrow$, out of the multiset $\{A, B\}$.

We are given $A \xrightarrow{f} B \xrightarrow{g} A$. If both $f$ and $g$ are mono then by Lemma 1 $A$
and $B$ are isomorphic and we may take $C$ to be $A$. Otherwise, by symmetry we
may suppose without loss of generality that $f$ is not mono. Factor the arrow $gf$ as
epi-mono, obtaining $A \xrightarrow{e} X \xrightarrow{m} A$. Now, $e$ is not mono, otherwise $gf$ would be,
contradicting the assumption that $f$ is not mono. In particular $e$ is not iso, and
so $A \Rightarrow X$. Since $X \rightleftarrows B$ we may apply the induction hypothesis to $\{X, B\}$,
obtaining $C$ with $A \Rightarrow X \Rightarrow^* C \mathrel{{}^*{\Leftarrow}} B$ as desired. ///

The previous results imply that the relation $\Rightarrow$ is a terminating and confluent
abstract reduction system capturing the equivalence relation $\rightleftarrows$.

Corollary 1 (Normal Forms for $\rightleftarrows$). Suppose $C$ is a hom-finite category
with epi-mono factorisation.

If $A \rightleftarrows B$, then there is a $C$, unique up to isomorphism, such that $A \Rightarrow^* C$
and $B \Rightarrow^* C$ and $C$ is $\Rightarrow$-irreducible.

Proof. By Lemma 2 we may let $C$ and $C'$ be any $\Rightarrow$-irreducible objects such
that $A \Rightarrow^* C$ and $B \Rightarrow^* C'$ respectively. Then $C \rightleftarrows C'$. But Proposition 1 and
the irreducibility of $C$ and $C'$ imply that $C$ and $C'$ are isomorphic. ///

Observe that in the preceding Corollary we have $A \rightleftarrows C$ and $B \rightleftarrows C$. Also
note that by taking $B$ to be $A$ we may conclude that for each $A$ there is a $C$,
unique up to isomorphism, such that $A \Rightarrow^* C$ and $C$ is $\Rightarrow$-irreducible. We refer
to such a $C$ as a “$\Rightarrow$-normal form for $A$”.



5 Normal Forms for Diagrams 



We want to apply the results of the previous section to the categories $\mathcal{D}$ and
$\mathcal{D}^n$. The following facts about $\mathcal{D}$ and each $\mathcal{D}^n$ are easy to check: (i) a map is
epi if and only if it is surjective on vertices and on edges, and (ii) a map is an
isomorphism if it is bijective on vertices and on edges. The next result is deeper;
a proof can be found in [14].



Theorem 5. Let $g \in \mathcal{G}$. Then $g$ is in $\mathcal{D}$ if and only if the underlying undirected
graph of $g$ has neither the diamond nor the diamond ring (see Figure 3) as a minor.

In particular, the set of graphs $\mathcal{D}$ is closed under the formation of subgraphs,
i.e., if $g \in \mathcal{D}$ and $h$ is a connected subgraph containing $s_g$ and $f_g$ then $h \in \mathcal{D}$.

Observe that the categories $\mathcal{D}^n$ have the same objects as $\mathcal{D}$ so the above
theorem applies immediately to the $\mathcal{D}^n$. Theorem 5 is crucial in verifying that
$\mathcal{D}^n$ supports the techniques of the previous section.

Proposition 2. The categories $\mathcal{D}$ and each $\mathcal{D}^n$ are hom-finite and have epi-mono
factorisation.

Proof. The first assertion is easy to see. For the second, let $\varphi : g_1 \to g_2$ be an
arrow in $\mathcal{D}^n$. Then the graph $\varphi(g_1)$ is a subgraph of $g_2 \in \mathcal{D}$, hence by Theorem 5
it is also in $\mathcal{D}$. So we have $g_1 \xrightarrow{\varphi'} \varphi(g_1) \xrightarrow{i} g_2$, where $\varphi'$ and $i$ are epi and mono
respectively. ///

Fig. 6. A graph in $\mathcal{D}$ (top) and possible reductions. Observe that the reduction of
the graphs in the middle to a common graph (bottom) depends on the existence
of the operator br (dom).



Remark. It is precisely here that we see the benefit of our extended signature. 
The class of diagrams built over the traditional signature — without dom — 
does not have epi-mono factorisation, due to the fact that it is not closed under 
subgraphs. Figure 6 shows this by example. 

We can now present our main results. 

Theorem 6. Let $C$ be either $\mathcal{D}$ or one of the $\mathcal{D}^n$.

If $g \rightleftarrows h$ then there is a $k$, unique up to isomorphism, such that $g \Rightarrow^* k$ and
$h \Rightarrow^* k$ and $k$ is $\Rightarrow$-irreducible.

For each graph $g$, there is a graph $\mathrm{nf}(g)$ such that $\mathrm{nf}(g)$ is $\Rightarrow$-irreducible
and such that for any $h$, $g \rightleftarrows h$ in $C$ if and only if $\mathrm{nf}(g) \cong \mathrm{nf}(h)$. The graph
$\mathrm{nf}(g)$ is unique up to isomorphism.

Proof. This is an immediate consequence of Corollary 1 and Proposition 2. ///

It is important to note that the notions $\rightleftarrows$ and $\Rightarrow$ in the previous Theorem
are taken relative to the category ($\mathcal{D}$ or $\mathcal{D}^n$) one has chosen to work in.

Theorem 7. For $\mathcal{D}$ and for each $\mathcal{D}^n$ the relation $\rightleftarrows$ is decidable in
non-deterministic polynomial time. Each theory $E^n_{\mathcal{R}}$ is decidable in non-deterministic
polynomial time.

Proof. Decidability follows from the previous results and the fact that $\Rightarrow$ is
computable. To get the NP upper bound we examine the complexity of reduction
to normal form. If $A \Rightarrow A'$ then the sum of the number of vertices and edges
of $A$ must exceed that of $A'$, since epimorphisms are surjections and bijections
are isomorphisms (and we know the map from $A$ to $A'$ is not an isomorphism
by definition). So any reduction of a diagram $A$ to normal form takes a number
of steps bounded by the size of $A$. So to test whether $A \rightleftarrows B$ we can generate
sequences of morphisms reducing each of $A$ and $B$ (not necessarily to normal
form) and test that the results are isomorphic. The latter test is of course itself
in NP.

The second assertion follows immediately from the definition of $E^n_{\mathcal{R}}$. ///
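
The reduction to normal form can be sketched by brute force, working in the category with all graph morphisms. The code below makes simplifying assumptions of its own: edges are stored as a set of triples, so parallel edges with equal labels are already identified, and one reduction step replaces a graph by the image of any endomorphism that is not a bijection, which is a proper subgraph. All function names are illustrative. The search is exponential, whereas the NP bound above rests on guessing the morphisms.

```python
from itertools import permutations, product

def graph(vertices, edges, start, finish):
    return (frozenset(vertices), frozenset(edges), start, finish)

def proper_image(g):
    """Image of some non-bijective endomorphism of g, or None if g is irreducible."""
    vertices, edges, start, finish = g
    free = sorted(v for v in vertices if v not in (start, finish))
    for choice in product(sorted(vertices), repeat=len(free)):
        phi = dict(zip(free, choice))
        phi[start], phi[finish] = start, finish
        image_edges = frozenset((phi[u], a, phi[v]) for u, a, v in edges)
        image_vertices = frozenset(phi.values())
        if image_edges <= edges and (image_vertices, image_edges) != (vertices, edges):
            return (image_vertices, image_edges, start, finish)
    return None

def normal_form(g):
    # Each step strictly shrinks the graph, so the loop terminates.
    while (smaller := proper_image(g)) is not None:
        g = smaller
    return g

def isomorphic(g, h):
    g_vertices, g_edges, g_start, g_finish = g
    h_vertices, h_edges, h_start, h_finish = h
    if len(g_vertices) != len(h_vertices):
        return False
    ordered = sorted(g_vertices)
    for perm in permutations(sorted(h_vertices)):
        phi = dict(zip(ordered, perm))
        if (phi[g_start], phi[g_finish]) == (h_start, h_finish) and \
                frozenset((phi[u], a, phi[v]) for u, a, v in g_edges) == h_edges:
            return True
    return False

def equivalent(g, h):
    """Compare normal forms up to isomorphism."""
    return isomorphic(normal_form(g), normal_form(h))

# Diagrams of x(y ∩ z), of x(y ∩ z) ∩ xy, and of xy.
left = graph("svf", [("s", "x", "v"), ("v", "y", "f"), ("v", "z", "f")], "s", "f")
right = graph("svwf", [("s", "x", "v"), ("v", "y", "f"), ("v", "z", "f"),
                       ("s", "x", "w"), ("w", "y", "f")], "s", "f")
xy = graph("svf", [("s", "x", "v"), ("v", "y", "f")], "s", "f")
```

On this example the diagram of x(y ∩ z) ∩ xy reduces in one step to the diagram of x(y ∩ z), which is irreducible, so the two are equivalent, while the diagram of xy has a different normal form.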

6 Conclusion 

We have examined the equational theory $E_{\mathcal{R}}$ of binary relations over sets and
a family of approximations to this theory. The theory $E^1_{\mathcal{R}}$ is Freyd and
Scedrov’s theory of allegories. By working with a natural notion of diagram for 
a relation-expression we have defined a notion of reduction of a diagram which 
yields an analysis of the theories above. A surprisingly important aspect was the 
inclusion of a “domain” operator in the signature: the corresponding operation 
is definable in terms of the traditional operations, but the class of diagrams for 
the enriched signature has better closure properties. 

Since each notion of reduction of diagrams is terminating and confluent, we 
may compute unique normal forms for each of the theories. Each theory is there- 
fore decidable and in fact normal forms can be computed in non-deterministic 
polynomial time. 

The decidability of $E^n_{\mathcal{R}}$ is reminiscent of the decidability of $E_{\mathcal{R}}$, but has been
more difficult to establish. This is because although equality in $E_{\mathcal{R}}$ is witnessed
by any pair of graph morphisms between diagrams, equality in $E^n_{\mathcal{R}}$ is witnessed
by a sequence of restricted morphisms. The length of this sequence could be
bounded only after our work relating $\rightleftarrows$ and $\Rightarrow$.



Abstract. Parallelism constraints are logical descriptions of trees. They
are as expressive as context unification, i.e. second-order linear unifica- 
tion. We present a semi-decision procedure enumerating all “most general 
unifiers” of a parallelism constraint and prove it sound and complete. 
In contrast to all known procedures for context unification, the presented
procedure terminates for the important fragment of dominance
constraints and performs reasonably well in a recent application to un- 
derspecified natural language semantics. 



1 Introduction 

Parallelism constraints [7, 16] are logical descriptions of trees. They are equal 
in expressive power to context unification [4], a variant of linear second-order 
unification [13, 18]. The decidability of context unification is a prominent open 
problem [20] even though several fragments are known decidable [22, 21, 4]. 

Parallelism constraints state relations between the nodes of a tree: mother-of,
sibling-of and labeling, dominance (ancestor-of), disjointness, inequality, and
parallelism. Parallelism $\pi_1/\pi_2 \sim \pi_3/\pi_4$, as illustrated in Figure 1, holds in a tree
if the structure of the tree between the nodes $\pi_1$ and $\pi_2$ (i.e., the tree
below $\pi_1$ minus the tree below $\pi_2$) is isomorphic to that between $\pi_3$ and $\pi_4$.

Parallelism constraints differ from context 
unification in their perspective on trees. They 
view trees from inside, talking about the nodes of a single tree, rather than from 
the outside, talking about relations between several trees. This difference has im- 
portant consequences. First, it is not only a difference of nodes versus trees but 
also one of occurrences versus structure. Second, different decidable fragments 
can be distinguished for parallelism constraints and context unification. Third, 
different algorithms can be devised. For instance, the language of dominance 
constraints [15, 24, 1, 9] is a decidable fragment of parallelism constraints for
which powerful solvers exist [6, 5, 16]. But when encoded into context unification,
dominance constraints are not subsumed by any of the decidable fragments
mentioned above, not even by subtree constraints [23], although they look similar.
The difference is again that dominance constraints speak about occurrences of
subtrees whereas subtree constraints speak about their structure.

Fig. 1. Parallelism $\pi_1/\pi_2 \sim \pi_3/\pi_4$

** Supported through the Collaborative Research Center (SFB) 378 of the DFG, the
Esprit Working Group CCL II (EP 22457), and the Procope project of the DAAD.

Parallelism constraints form the backbone of a recent underspecified analysis 
of natural language semantics [7, 11]. This analysis uses the fragment of dom- 
inance constraints to describe scope ambiguities in a similar fashion as [19, 2], 
while the full expressivity of parallelism is needed for modeling ellipsis. An earlier 
treatment of semantic underspecification [17] was based directly on context uni- 
fication. The implementation used an incomplete procedure [10] which guesses 
trees top-down by imitation and projection, leaving out flex-flex. This procedure 
performs well on the parallelism phenomena encountered in ellipsis resolution, 
but when dealing with scope ambiguities, it consistently runs into combinatoric 
explosion. To put it differently, this procedure does not perform well enough on 
the context unification equivalent of dominance constraints. 

In this paper, we propose a new semi-decision procedure for parallelism con- 
straints built on top of a powerful, terminating solver for dominance constraints. 
We prove our procedure sound and complete: We define the notion of a minimal 
solved form for parallelism constraints, which plays the same role as most general 
unifiers in unification theory. We then show that our procedure enumerates all 
minimal solved forms of a given parallelism constraint. 

Plan of the Paper. In the following section, we describe the syntax and seman- 
tics of dominance and parallelism constraints. Section 3 presents an algorithm 
for dominance constraints which in section 4 is extended to a semi-decision pro- 
cedure for parallelism constraints. In sections 5 and 6 we sketch a proof of sound- 
ness and completeness. Section 7 further extends the semi-decision procedure by 
optional rules, which yield stronger propagation speeding up a first prototype 
implementation. Section 8 concludes. Many proofs are omitted for lack of space
but can be found in an extended version [8]. 



2 Syntax and Semantics 

Semantics. We assume a signature S of function symbols ranged over by f,g ,.. ., 
each of which is equipped with an arity ar(/) > 0. Constants are function symbols 
of arity 0 denoted by a, b. We further assume that S contains at least one 
constant and a symbol of arity at least 2. 
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A (finite) tree r is a ground term over A, for instance 
f{g{a,a)). A node of a tree can be identified with its path 
from the root down, expressed by a word over N+, the set of 
natural numbers excluding 0. We write e for the empty path 
and 7ri7T2 for the concatenation of tti and 7T2. A path tt is a 
prefix of a path tt' if there exists some (possibly empty) tt" 
such that 7T7r" = tt'. 

A tree can be characterized uniquely by a tree domain (the set of its paths) 
and a labeling function. A tree domain $D$ is a finite nonempty prefix-closed 
set of paths. A path $\pi i \in D$ is the $i$-th child of the node/path 
$\pi \in D$. A labeling function is a function $L : D \to \Sigma$ fulfilling 
the condition that for every $\pi \in D$ and $k \geq 1$, $\pi k \in D$ iff 
$k \leq \mathrm{ar}(L(\pi))$. We write $D_\tau$ for the domain of a tree $\tau$ 
and $L_\tau$ for its labeling function. For instance, the tree 
$\tau = f(g(a,a))$ displayed in Fig. 2 satisfies 
$D_\tau = \{\epsilon, 1, 11, 12\}$, $L_\tau(\epsilon) = f$, $L_\tau(1) = g$, and 
$L_\tau(11) = a = L_\tau(12)$. 
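The characterization of a tree by its domain and labeling function can be made concrete with a small sketch. The encoding below is an assumption made for illustration only: terms are nested Python tuples `(symbol, child_1, ..., child_n)`, paths are tuples of positive integers, and the helper names `tree_labels` and `is_tree_domain` are hypothetical.

```python
def tree_labels(term, path=()):
    """Map every path of `term` to its label; the keys form the tree domain."""
    symbol, *children = term
    labels = {path: symbol}
    for i, child in enumerate(children, start=1):
        # The i-th child of the node at `path` sits at path followed by i.
        labels.update(tree_labels(child, path + (i,)))
    return labels


def is_tree_domain(domain):
    """A tree domain is nonempty and prefix-closed (finiteness is implicit)."""
    return bool(domain) and all(p[:-1] in domain for p in domain if p)


# The running example tau = f(g(a, a)).
tau = ("f", ("g", ("a",), ("a",)))
L_tau = tree_labels(tau)
D_tau = set(L_tau)
```

Here `D_tau` plays the role of $D_\tau$ (with `()` standing for $\epsilon$ and `(1, 2)` for the path 12), and `L_tau` plays the role of $L_\tau$.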

Definition 1. The tree structure $\mathcal{M}^\tau$ of a tree $\tau$ is a 
first-order structure with domain $D_\tau$. It provides a labeling relation 
${:}f^\tau$ for each $f \in \Sigma$: 

$${:}f^\tau = \{(\pi, \pi 1, \ldots, \pi n) \mid L_\tau(\pi) = f,\ \mathrm{ar}(f) = n\}$$

We write $\mathcal{M}^\tau \models \pi{:}f(\pi_1, \ldots, \pi_n)$ for 
$(\pi, \pi_1, \ldots, \pi_n) \in {:}f^\tau$; this relation states that node 
$\pi$ of $\tau$ is labeled by $f$ and has $\pi_i$ as its $i$-th child (for 
$1 \leq i \leq n$). Every tree structure $\mathcal{M}^\tau$ can be extended 
conservatively by relations for dominance, disjointness, and parallelism. 
Dominance is the prefix relation between paths $\pi \triangleleft^* \pi'$; 
restricted to $D_\tau$, it is the ancestor relation of $\tau$; we write 
$\pi \triangleleft^+ \pi'$ if $\pi \triangleleft^* \pi'$ and $\pi \neq \pi'$. 
Disjointness $\pi \perp \pi'$ holds if neither $\pi \triangleleft^* \pi'$ nor 
$\pi' \triangleleft^* \pi$. Concerning parallelism, let 
$\mathrm{betw}_\tau(\pi_1, \pi_2)$ be the set of nodes in the substructure of 
$\tau$ between $\pi_1$ and $\pi_2$: If $\pi_1 \triangleleft^* \pi_2$ holds in 
$\mathcal{M}^\tau$, we define 

$$\mathrm{betw}_\tau(\pi_1, \pi_2) = \{\pi \in D_\tau \mid \pi_1 \triangleleft^* \pi \text{ but not } \pi_2 \triangleleft^+ \pi\}.$$

The node $\pi_2$ plays a special role: it is part of the substructure of $\tau$ 
between $\pi_1$ and $\pi_2$, but its label is not. This is expressed in Def. 2, 
which is illustrated in Fig. 1. 
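The relations just introduced are easy to compute on paths. The following is a minimal sketch, assuming paths are encoded as tuples of positive integers and a tree domain as a Python set of such tuples; the function names are hypothetical and chosen only for this illustration.

```python
def dominates(p, q):
    """Reflexive dominance: p is a prefix of q."""
    return q[:len(p)] == p


def strictly_dominates(p, q):
    """Strict dominance: p dominates q and p differs from q."""
    return p != q and dominates(p, q)


def disjoint(p, q):
    """Neither path is a prefix of the other."""
    return not dominates(p, q) and not dominates(q, p)


def betw(domain, p1, p2):
    """Nodes dominated by p1 that are not strictly below p2 (p2 itself is included)."""
    if not dominates(p1, p2):
        raise ValueError("betw is only defined when p1 dominates p2")
    return {p for p in domain
            if dominates(p1, p) and not strictly_dominates(p2, p)}


# Domain of the running example f(g(a, a)).
D_tau = {(), (1,), (1, 1), (1, 2)}
between_root_and_g = betw(D_tau, (), (1,))
between_g_and_leaf = betw(D_tau, (1,), (1, 1))
```

In this example the segment between the root and node 1 consists of exactly those two nodes, while the segment between node 1 and its first child also contains the second child, since that child lies below node 1 but not below node 11.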

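Definition 2. Parallelism 
$\mathcal{M}^\tau \models \pi_1/\pi_2 \sim \pi_3/\pi_4$ holds iff 
$\pi_1 \triangleleft^* \pi_2$ and $\pi_3 \triangleleft^* \pi_4$ are valid in 
$\mathcal{M}^\tau$ and there exists a correspondence function 
$c : \mathrm{betw}_\tau(\pi_1, \pi_2) \to \mathrm{betw}_\tau(\pi_3, \pi_4)$, a 
bijective function which satisfies $c(\pi_1) = \pi_3$ and $c(\pi_2) = \pi_4$ 
and preserves the tree structure of $\mathcal{M}^\tau$, i.e. for all 
$\pi \in \mathrm{betw}_\tau(\pi_1, \pi_2) - \{\pi_2\}$, $f \in \Sigma$, and 
$n = \mathrm{ar}(f)$: 

$$\mathcal{M}^\tau \models \pi{:}f(\pi 1, \ldots, \pi n) \quad\text{iff}\quad \mathcal{M}^\tau \models c(\pi){:}f(c(\pi 1), \ldots, c(\pi n))$$

Fig. 2. $f(g(a,a))$ 

Lemma 1. If $c : \mathrm{betw}_\tau(\pi_1, \pi_2) \to \mathrm{betw}_\tau(\pi_3, \pi_4)$ 
is a correspondence function, then $c(\pi_1\pi) = \pi_3\pi$ for all 
$\pi_1\pi \in \mathrm{betw}_\tau(\pi_1, \pi_2)$. 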
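Lemma 1 suggests a direct way to test parallelism in a concrete tree: the only candidate for a correspondence function maps $\pi_1\pi$ to $\pi_3\pi$, so it suffices to check that this map is a bijection between the two segments, sends $\pi_2$ to $\pi_4$, and preserves labels everywhere except at $\pi_2$. The sketch below follows this reading; the dictionary encoding of a tree (path tuple to label) and the function names are assumptions for illustration.

```python
def dominates(p, q):
    """Reflexive dominance: p is a prefix of q."""
    return q[:len(p)] == p


def betw(labels, p1, p2):
    """Nodes below p1 (inclusive) that are not strictly below p2."""
    return {p for p in labels
            if dominates(p1, p) and not (p != p2 and dominates(p2, p))}


def parallel(labels, p1, p2, p3, p4):
    """Check p1/p2 ~ p3/p4 in the tree given by `labels` (path -> symbol)."""
    if not (dominates(p1, p2) and dominates(p3, p4)):
        return False
    # c(p2) = p4 together with Lemma 1 forces equal relative paths.
    if p2[len(p1):] != p4[len(p3):]:
        return False
    left, right = betw(labels, p1, p2), betw(labels, p3, p4)
    # Candidate correspondence from Lemma 1: c(p1 + rest) = p3 + rest.
    c = {p: p3 + p[len(p1):] for p in left}
    if set(c.values()) != right:
        return False
    # Labels must agree everywhere in the segment except at p2 itself.
    return all(labels[p] == labels[c[p]] for p in left if p != p2)


# The tree f(g(a), g(b)): both g-contexts are parallel although a and b differ.
labels = {(): "f", (1,): "g", (1, 1): "a", (2,): "g", (2, 1): "b"}
holds = parallel(labels, (1,), (1, 1), (2,), (2, 1))
fails = parallel(labels, (), (1,), (1,), (1, 1))
```

The first query holds because the labels of the exceptional nodes 11 and 21 are not compared; the second fails because the two segments have different shapes.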
Syntax. We assume an infinite set $\mathcal{V}$ of (node) variables ranged over 
by $X, Y, Z, U, V, W$. A (parallelism) constraint $\varphi$ is a conjunction of 
atomic constraints or literals for parallelism, dominance, labeling, 
disjointness, and inequality. A dominance constraint is a constraint without 
parallelism literals. The abstract syntax of parallelism constraints is defined 
as follows: 

$$\varphi, \psi ::= X_1/X_2 \sim Y_1/Y_2 \mid X \triangleleft^* Y \mid X{:}f(X_1, \ldots, X_n) \ (\mathrm{ar}(f) = n) \mid X \perp Y \mid X \neq Y \mid \mathit{false} \mid \varphi \wedge \psi$$

Abbreviations: $X{=}Y$ for $X \triangleleft^* Y \wedge Y \triangleleft^* X$ and 
$X \triangleleft^+ Y$ for $X \triangleleft^* Y \wedge X \neq Y$. 

For simplicity, we view parallelism, inequality, and disjointness literals as 
symmetric. We also write $X R Y$, where 
$R \in \{\triangleleft^*, \triangleleft^+, \perp, \neq, =\}$. A richer set of 
relations could be used, as proposed in [6], but this would complicate matters 
slightly. For a comparison to context unification, we refer to [16]. An example 
for the simpler case of string unification is given below (see Fig. 4). 

First order formulas $\Phi$ built from constraints and the usual logical 
connectives are interpreted over the class of tree structures in the usual 
Tarskian way. We write $\mathcal{V}(\Phi)$ for the set of variables occurring 
in $\Phi$. If a pair $(\mathcal{M}^\tau, \alpha)$ of a tree structure 
$\mathcal{M}^\tau$ and a variable assignment $\alpha : \mathcal{G} \to D_\tau$, 
for some set $\mathcal{G} \supseteq \mathcal{V}(\Phi)$, satisfies $\Phi$, we 
write this as $(\mathcal{M}^\tau, \alpha) \models \Phi$ and say that 
$(\mathcal{M}^\tau, \alpha)$ is a solution of $\Phi$. We say that $\Phi$ is 
satisfiable iff it possesses a solution. Entailment $\Phi \models \Phi'$ means 
that all solutions of $\Phi$ are also solutions of $\Phi'$. 



We often draw constraints as graphs with the nodes representing variables; a 
labeled variable is connected to its children by solid lines, while a dotted 
line represents dominance. For example, the graph for 
$X{:}f(X_1, X_2) \wedge X_1 \triangleleft^* Y \wedge X_2 \triangleleft^* Y$ is 
displayed in Fig. 3. As trees do not branch upwards, this constraint is 
unsatisfiable. 

Fig. 3. An unsatisfiable constraint 

Parallelism literals are shown graphically as well as textually: the square 
brackets in Fig. 4 illustrate the parallelism literal written beside the graph. 
This graph encodes the string unification [14] problem qx = xq; the two 
brackets represent the two occurrences of x. Disjointness and inequality 
literals are not represented graphically. 

Fig. 4. String unification 



3 Solving Dominance Constraints 

Our semi-decision procedure for parallelism constraints consists of two parts: 
a terminating dominance constraint solver, and a part dealing with parallelism 
proper. Having our procedure terminate for general dominance constraints and 
perform well for dominance constraints in linguistic applications was an impor- 
tant design requirement for us. 
In this section, we present the first part of our procedure, the solver for 
dominance constraints. This solver, which is similar to the algorithms in 
[12, 6] and could in principle be replaced by them, terminates in 
non-deterministic polynomial time. Actually, satisfiability of dominance 
constraints is NP-complete [12]. Boolean satisfiability is encoded by forcing 
graph fragments to "overlap" and making the algorithm choose between different 
possible overlappings. For instance, the constraint in Fig. 5 entails 
$X{=}Y \vee X{=}Y_1$. The solver is intended to perform well in cases without 
overlap, where distinct variables denote distinct values. This can typically be 
assumed in linguistic applications. 

We organize all procedures in this paper as saturation algorithms. A saturation 
algorithm consists of a set of saturation rules, each of which has the form 
$\varphi \to \bigvee_{i=1}^{n} \varphi_i$ for some $n \geq 1$. A rule is a 
propagation rule if $n = 1$, and a distribution rule otherwise. The only 
critical rules with respect to termination are those which introduce fresh 
variables on their right hand side. A rule $\varphi \to \Phi$ is correct if 
$\varphi \models \exists V\, \Phi$ where 
$V = \mathcal{V}(\Phi) - \mathcal{V}(\varphi)$. 

By a slight abuse of notation, we identify a constraint with the set of its 
literals. This way, subset inclusion defines a partial ordering $\subseteq$ on 
constraints; we also write $=$ for the corresponding equality 
$\subseteq \cap \supseteq$, and $\subset$ for the strict variant 
$\subseteq \cap \neq$. This way, we can define saturation for a set $S$ of 
saturation rules as follows: We assume that each rule $\rho \in S$ comes with 
an application condition $C_\rho(\varphi)$ deciding whether $\rho$ can be 
applied to $\varphi$ or not. A saturation step $\to_S$ consists of one 
application of a rule in $S$: 

$$\frac{\varphi' \subseteq \varphi \qquad \rho \in S}{\varphi \to_S \varphi \wedge \varphi_i} \quad \text{if } C_\rho(\varphi), \text{ where } \rho \text{ is } \varphi' \to \bigvee_{i=1}^{n} \varphi_i$$

For this section, we let $C_{\varphi' \to \bigvee_{i=1}^{n} \varphi_i}(\varphi)$ 
be true iff $\varphi_i \not\subseteq \varphi$ for all $1 \leq i \leq n$. We 
call a constraint $S$-saturated if it is irreducible with respect to $\to_S$ 
and clash-free if it does not contain false. We also say that a constraint is 
in $S$-solved form if it is $S$-saturated and clash-free. 

Figure 6 contains schemata for saturation rules that together solve dominance 
constraints. Let D be the (infinite) set of instances of these schemata. Both 
clash schemata are obvious. Next, there are standard schemata for reflexivity, 
transitivity, decomposition, and inequality. Schema (D.Lab.Dom) declares that 
a parent dominates its children. 

We illustrate the remaining schemata of propagation rules by an example: We 
reconsider the unsatisfiable constraint 
$X{:}f(X_1, X_2) \wedge X_1 \triangleleft^* Y \wedge X_2 \triangleleft^* Y$ of 
Fig. 3. By (D.Lab.Disj), we infer $X_1 \perp X_2$, from which (D.Prop.Disj) 
yields $Y \perp Y$, which then clashes by (D.Clash.Disj). 
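The inference chain of this example can be replayed mechanically. The sketch below saturates a set of literals under a subset of the propagation rules of Fig. 6 (it omits (D.Eq.Decom), (D.Lab.Ineq), and both distribution rules, so it is not a full solver). The tuple encoding of literals and the function names are assumptions made for this illustration.

```python
from itertools import combinations


def variables_of(phi):
    """Collect the variables occurring in a set of literals."""
    vs = set()
    for lit in phi:
        if lit[0] == "lab":                 # ("lab", X, f, (X1, ..., Xn))
            vs.add(lit[1])
            vs.update(lit[3])
        elif lit[0] != "false":             # ("dom"|"disj"|"neq", X, Y)
            vs.update(lit[1:])
    return vs


def propagate(literals):
    """Saturate under Refl, Trans, Lab.Dom, Lab.Disj, Prop.Disj and both clash rules."""
    phi = set(literals)
    while True:
        new = set()
        dom = {(l[1], l[2]) for l in phi if l[0] == "dom"}
        disj = {(l[1], l[2]) for l in phi if l[0] == "disj"}
        neq = {(l[1], l[2]) for l in phi if l[0] == "neq"}
        for x in variables_of(phi):                     # (D.Dom.Refl)
            new.add(("dom", x, x))
        for (x, y) in dom:                              # (D.Dom.Trans)
            for (y2, z) in dom:
                if y == y2:
                    new.add(("dom", x, z))
        for lit in phi:
            if lit[0] == "lab":
                x, kids = lit[1], lit[3]
                for y in kids:                          # (D.Lab.Dom): X <+ Y
                    new |= {("dom", x, y), ("neq", x, y)}
                for a, b in combinations(kids, 2):      # (D.Lab.Disj)
                    new.add(("disj", a, b))
        for (x, y) in disj:
            new.add(("disj", y, x))                     # disjointness is symmetric
            for (x1, xp) in dom:                        # (D.Prop.Disj)
                for (y1, yp) in dom:
                    if x1 == x and y1 == y:
                        new.add(("disj", yp, xp))
            if x == y:                                  # (D.Clash.Disj)
                new.add(("false",))
        for (x, y) in neq:
            new.add(("neq", y, x))                      # inequality is symmetric
            if (x, y) in dom and (y, x) in dom:         # (D.Clash.Ineq)
                new.add(("false",))
        if new <= phi:
            return phi
        phi |= new


# The constraint of Fig. 3: X:f(X1, X2), X1 <* Y, X2 <* Y.
fig3 = {("lab", "X", "f", ("X1", "X2")), ("dom", "X1", "Y"), ("dom", "X2", "Y")}
clashes = ("false",) in propagate(fig3)

# Dropping X2 <* Y removes the clash.
relaxed = {("lab", "X", "f", ("X1", "X2")), ("dom", "X1", "Y")}
relaxed_clashes = ("false",) in propagate(relaxed)
```

The loop terminates because these rules only add literals over the variables already present, of which there are finitely many.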



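Fig. 5. Overlap 

Propagation rules: 

(D.Clash.Ineq)   $X{=}Y \wedge X \neq Y \to \mathit{false}$ 
(D.Clash.Disj)   $X \perp X \to \mathit{false}$ 
(D.Dom.Refl)     $\varphi \to X \triangleleft^* X$  where $X \in \mathcal{V}(\varphi)$ 
(D.Dom.Trans)    $X \triangleleft^* Y \wedge Y \triangleleft^* Z \to X \triangleleft^* Z$ 
(D.Eq.Decom)     $X{:}f(X_1, \ldots, X_n) \wedge Y{:}f(Y_1, \ldots, Y_n) \wedge X{=}Y \to X_i{=}Y_i$ 
(D.Lab.Ineq)     $X{:}f(\ldots) \wedge Y{:}g(\ldots) \to X \neq Y$  where $f \neq g$ 
(D.Lab.Disj)     $X{:}f(\ldots, X_i, \ldots, X_j, \ldots) \to X_i \perp X_j$  for $1 \leq i < j \leq n$ 
(D.Prop.Disj)    $X \perp Y \wedge X \triangleleft^* X' \wedge Y \triangleleft^* Y' \to Y' \perp X'$ 
(D.Lab.Dom)      $X{:}f(\ldots, Y, \ldots) \to X \triangleleft^+ Y$ 

Distribution rules: 

(D.Distr.NotDisj)  $X \triangleleft^* Z \wedge Y \triangleleft^* Z \to X \triangleleft^* Y \vee Y \triangleleft^* X$ 
(D.Distr.Child)    $X \triangleleft^* Y \wedge X{:}f(X_1, \ldots, X_n) \to Y{=}X \vee \bigvee_{i=1}^{n} X_i \triangleleft^* Y$ 

Fig. 6. Solving dominance constraints: rule set D 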



There are only two situations where distribution is necessary. The situation 
shown in Fig. 7 is handled by (D.Distr.NotDisj): the tree nodes denoted by $X$ 
and $Y$ cannot be at disjoint positions because they both dominate $Z$. 

The distribution rule (D.Distr.Child) is applicable to the constraint in 
Fig. 5: As the constraint contains 
$Y{:}f(Y_1, Y_2) \wedge Y \triangleleft^* X$, we must have either $Y{=}X$ or 
$Y_1 \triangleleft^* X$ or $Y_2 \triangleleft^* X$. Propagation proves that the 
third choice results in a clash, while the others lead to satisfiable 
constraints. 

Fig. 7. Nondisjointness 



Proposition 1 (Soundness). Any dominance constraint in D-solved form is 
satisfiable. 

The proof proceeds along the lines of [12]. On the other hand, the saturation 
algorithm for D is complete in the sense that it computes every minimal solved 
form of a dominance constraint. 



Definition 3. Let $\varphi, \varphi'$ be constraints, $S$ a set of saturation 
rules and $\preceq$ a partial order on constraints. Then $\varphi'$ is a 
$\preceq$-minimal $S$-solved form for $\varphi$ iff $\varphi'$ is an $S$-solved 
form that is $\preceq$-minimal satisfying $\varphi \preceq \varphi'$. 

For dominance constraints, we can simply use set inclusion. As an example, a 
$\subseteq$-minimal D-solved form for the constraint in Fig. 8 is 
$X \triangleleft^* Y \wedge X \triangleleft^* Z \wedge X \triangleleft^* X \wedge Y \triangleleft^* Y \wedge Z \triangleleft^* Z$. 
(Note that $X$ does not need to be labeled.) 

Fig. 8. A solved form 

Lemma 2 (Completeness). Let $\varphi$ be a dominance constraint and $\varphi'$ 
a $\subseteq$-minimal D-solved form for $\varphi$. Then 
$\varphi \to_D^* \varphi'$. 

Proof. By well-founded induction on the strict partial order $\supset$ on the 
set $\{\psi \mid \psi \subseteq \varphi'\}$. If $\varphi$ is D-solved then 
$\varphi = \varphi'$ by minimality and we are done. Otherwise, there is a rule 
$\psi \to \bigvee_{i=1}^{n} \psi_i$ in D which applies to $\varphi$. Since 
$\psi \subseteq \varphi \subseteq \varphi'$ and $\varphi'$ is in D-solved form, 
there exists an $i$ such that $\psi_i \subseteq \varphi'$. By the inductive 
hypothesis, $\varphi \wedge \psi_i \to_D^* \varphi'$ and thus 
$\varphi \to_D^* \varphi'$. 

4 Processing Parallelism Constraints 

We extend the dominance constraint solver of the previous section to a semi- 
decision procedure for parallelism constraints. The main idea is to compute the 
correspondence functions for all parallelism literals in the input constraint (com- 
pare Def. 2). We use a new kind of literals, path equalities, to accomplish this 
with as much propagation and as little case distinction as possible. 

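We define the set of variables $\mathrm{betw}_\varphi(X_1, X_2)$ between $X_1$ 
and $X_2$ as the syntactic counterpart of the set of nodes 
$\mathrm{betw}_\tau(\pi_1, \pi_2)$: If $X_1 \triangleleft^* X_2 \in \varphi$, 
then 

$$\mathrm{betw}_\varphi(X_1, X_2) = \{X \in \mathcal{V}(\varphi) \mid X_1 \triangleleft^* X \in \varphi \text{ and } (X \triangleleft^* X_2 \in \varphi \text{ or } X \perp X_2 \in \varphi)\}.$$

Given a parallelism literal $X_1/X_2 \sim Y_1/Y_2$, we need to establish a 
syntactic correspondence function 
$c : \mathrm{betw}_\varphi(X_1, X_2) \to \mathrm{betw}_\varphi(Y_1, Y_2)$. In 
doing this, we may have to add new local variables to $\varphi$. In the 
following, we always consider a constraint $\varphi$ together with a set 
$\mathcal{G} \subseteq \mathcal{V}$ of global variables; all other variables 
are local. For an input constraint $\varphi$, we assume 
$\mathcal{V}(\varphi) \subseteq \mathcal{G}$. 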
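The syntactic set of variables between $X_1$ and $X_2$ is a simple lookup over the literals of the constraint. As a minimal sketch, assume a constraint is a Python set of tuples `("dom", X, Y)` for dominance and `("disj", X, Y)` for disjointness (treated as symmetric); this encoding and the name `betw_syntactic` are illustrative assumptions.

```python
def betw_syntactic(phi, x1, x2):
    """Variables X with X1 <* X in phi and (X <* X2 in phi or X disjoint from X2 in phi)."""
    if ("dom", x1, x2) not in phi:
        raise ValueError("betw is only defined when X1 <* X2 is in the constraint")
    below_x1 = {lit[2] for lit in phi if lit[0] == "dom" and lit[1] == x1}
    return {x for x in below_x1
            if ("dom", x, x2) in phi
            or ("disj", x, x2) in phi
            or ("disj", x2, x) in phi}


phi = {
    ("dom", "X1", "X1"), ("dom", "X2", "X2"),
    ("dom", "X1", "X2"),
    ("dom", "X1", "X"), ("disj", "X", "X2"),   # X is beside X2, hence between
    ("dom", "X1", "Z"),                        # position of Z relative to X2 unknown
}
between = betw_syntactic(phi, "X1", "X2")
```

Note that `Z` is excluded: the constraint does not yet say whether it lies above, beside, or below `X2`, which is exactly the situation that (P.Distr.Crown) later resolves by case distinction.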

We record syntactic correspondences by use of a new, auxiliary kind of 
constraints: a path equality $p\binom{X_1\;Y_1}{X\;Y}$ states, informally 
speaking, that $X$ below $X_1$ corresponds to $Y$ below $Y_1$. More precisely, 
a path equality relation 
$\mathcal{M}^\tau \models p\binom{\pi_1\;\pi_3}{\pi_2\;\pi_4}$ is true iff 
there exists a path $\pi$ such that $\pi_2 = \pi_1\pi$ and 
$\pi_4 = \pi_3\pi$, and for each $\pi' \triangleleft^+ \pi$, 
$L_\tau(\pi_1\pi') = L_\tau(\pi_3\pi')$. We are only interested in "generated" 
constraints where each path equality establishes a correspondence for some 
parallelism literal. 

Definition 4. Let $\varphi$ be a constraint. $\varphi$ is called generated iff 
for any $p\binom{U_1\;V_1}{U_2\;V_2} \in \varphi$ there exists some atomic 
parallelism constraint $X_1/X_2 \sim Y_1/Y_2 \in \varphi$ such that 
$U_1{=}X_1 \wedge V_1{=}Y_1$ is in $\varphi$, and 
$U_2 \in \mathrm{betw}_\varphi(X_1, X_2)$ or 
$V_2 \in \mathrm{betw}_\varphi(Y_1, Y_2)$. 

If $U_2 \in \mathrm{betw}_\varphi(X_1, X_2)$, then it must correspond to $V_2$ 
and inference will determine that $V_2$ must be between $Y_1$ and $Y_2$, and 
vice versa. 

Figure 9 shows the schemata of the sets P and N of saturation rules handling 
parallelism. The rule set $D \cup P \cup N$ forms a sound and complete 
semi-decision procedure for parallelism constraints, which we abbreviate by DPN 
(and accordingly for other rule set combinations). In section 7 we extend DPN 
by a set T of optional rules affording stronger propagation on path equality 
literals. DPN is more succinct and thus forms the basis of our proofs. DPNT is 
more interesting for practical purposes. 
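Propagation Rules: 

(P.Root)       $X_1/X_2 \sim Y_1/Y_2 \to p\binom{X_1\;Y_1}{X_1\;Y_1} \wedge p\binom{X_1\;Y_1}{X_2\;Y_2}$ 
(P.Copy.Dom)   $U_1 R U_2 \wedge \bigwedge_{i=1}^{2} p\binom{X_1\;Y_1}{U_i\;V_i} \to V_1 R V_2$  where $R \in \{\triangleleft^*, \perp, \neq\}$ 
(P.Copy.Lab)   $U_0{:}f(U_1, \ldots, U_n) \wedge \bigwedge_{i=0}^{n} p\binom{X_1\;Y_1}{U_i\;V_i} \wedge X_1/X_2 \sim Y_1/Y_2 \to V_0{:}f(V_1, \ldots, V_n)$ 
               where $U_0 \perp X_2 \in \varphi$ or $U_0 \triangleleft^+ X_2 \in \varphi$ 
(P.Path.Sym)   $p\binom{X\;Y}{U\;V} \to p\binom{Y\;X}{V\;U}$ 
(P.Path.Dom)   $p\binom{X\;Y}{U\;V} \to X \triangleleft^* U \wedge Y \triangleleft^* V$ 
(P.Path.Eq.1)  $p\binom{X_1\;X_2}{X_3\;X_4} \wedge \bigwedge_{i=1}^{4} X_i{=}Y_i \to p\binom{Y_1\;Y_2}{Y_3\;Y_4}$ 
(P.Path.Eq.2)  $p\binom{X\;X}{U\;V} \to U{=}V$ 

Distribution Rules: 

(P.Distr.Crown)    $X_1 \triangleleft^* X \wedge X_1/X_2 \sim Y_1/Y_2 \to X \triangleleft^* X_2 \vee X \perp X_2 \vee X_2 \triangleleft^+ X$ 
(P.Distr.Project)  $\varphi \to X{=}Y \vee X \neq Y$  where $X, Y \in \mathcal{V}(\varphi)$ 

Introduction of local variables: 

(N.New)  $\varphi \wedge X_1/X_2 \sim Y_1/Y_2 \to p\binom{X_1\;Y_1}{X\;X'}$  where $X \in \mathrm{betw}_\varphi(X_1, X_2)$; $X'$ new and local 

Fig. 9. Schemata of rule sets P and N for handling parallelism 

Fig. 10. Correspondence 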

Schema (P.Root) states, with respect to a parallelism literal 
$X_1/X_2 \sim Y_1/Y_2$, that $X_1$ corresponds to $Y_1$ and $X_2$ corresponds 
to $Y_2$. To see how to go on from there, consider the constraint in Fig. 10. 
Variable $X$ is between $X_1$ and $X_2$, and $Y$ is between $Y_1$ and $Y_2$. 
But they are just dominated by $X_1$ and $Y_1$, respectively; their position is 
not fixed. So it would be precipitous to assume that $X$ and $Y$ correspond: 
there is nothing in the constraint which would force us to do that. Schema 
(N.New) acts on this idea as follows: Given a literal $X_1/X_2 \sim Y_1/Y_2$ 
and a variable $X \in \mathrm{betw}_\varphi(X_1, X_2)$, correspondence 
$p\binom{X_1\;Y_1}{X\;X'}$ is stated between $X$ and a variable 
$X' \notin \mathcal{V}(\varphi) \cup \mathcal{G}$. If the structure of the 
constraint enforces correspondence between $X$ and some other variable 
$Y \in \mathrm{betw}_\varphi(Y_1, Y_2)$, then this will be inferred by 
saturation. (N.New) need only be applied if $X$ does not yet possess a 
correspondent within $X_1/X_2 \sim Y_1/Y_2$. We adapt the application condition 
for (N.New)-rules accordingly: 

$C_{\varphi' \to p\binom{X_1\;Y_1}{X\;X'}}(\varphi)$ is true iff 
$X' \notin \mathcal{V}(\varphi) \cup \mathcal{G}$ and 
$p\binom{X_1\;Y_1}{X\;Y} \notin \varphi$ for all variables $Y$. 



Recall that $\mathcal{G}$ is the set of global variables with respect to which 
we saturate our constraint. Given $X_1/X_2 \sim Y_1/Y_2 \in \varphi$, 
(P.Copy.Dom) and (P.Copy.Lab) copy dominance, disjointness, inequality, and 
labeling literals from $\mathrm{betw}_\varphi(X_1, X_2)$ to 







Katrin Erk and Joachim Niehren 




$\mathrm{betw}_\varphi(Y_1, Y_2)$ and vice versa. The condition on the position 
of $U_0$ in (P.Copy.Lab) makes sure that the labels of $X_2$ and $Y_2$ are not 
copied. 

Fig. 11. Resolving an atomic parallelism constraint 

Fig. 12. $X$ "inside" or "outside"? 

P contains two additional distribution rule schemata. (P.Distr.Crown) deals 
with situations like that in Fig. 12: We have to decide whether $X$ is in 
$\mathrm{betw}_\varphi(X_1, X_2)$ or not. Only then do we know whether we need 
to apply (N.New) to $X$. (P.Distr.Project), on the other hand, guesses whether 
two variables should be identified or not. It is a very powerful schema, so we 
do not want to use it too often in practice. Much of what it does by case 
distinction is accomplished through propagation by the optional rule schemata 
we present in section 7. 

How does syntactic correspondence as established by DPN relate to semantic 
correspondence functions as defined in Def. 2? (P.Root) implements the first 
property of correspondence functions; the "preservation of tree structure" 
property remains to be examined. Consider Fig. 11. Constraint 1 constitutes the 
input to the procedure, while constraint 2 shows, as grey arcs, the 
correspondences that must hold by Def. 2. These correspondences are computed by 
DPN: We infer $p\binom{X_1\;Y_1}{X_1\;Y_1} \wedge p\binom{X_1\;Y_1}{X_2\;Y_2}$ 
by (P.Root). (N.New) is applicable to $X$ and yields 
$p\binom{X_1\;Y_1}{X\;X'}$ for a new local variable $X'$. We have 
$X_1 \triangleleft^+ X_2$ by (D.Lab.Dom), so we may apply (P.Copy.Lab) to 
$X_1{:}f(X_2, X)$ and get $Y_1{:}f(Y_2, X')$. But since the constraint also 
contains $Y_1{:}f(Y_2, Y)$, (D.Eq.Decom) gives us $X'{=}Y$, from which 
(P.Path.Eq.1) infers $p\binom{X_1\;Y_1}{X\;Y}$. We see that the structure of 
the constraint has enforced correspondence between $X$ and $Y$, and saturation 
has made the correct inferences. 

While DPN computes only finitely many solved forms for the constraint in 
Fig. 11, the constraint in Fig. 13 possesses infinitely many different solved 
forms. One solved form contains $X_1{=}X_2{=}Y_1{=}Y_2$. Another contains 
$X_1 \triangleleft^+ X_2{=}Y_1 \triangleleft^+ Y_2$. For the case of 
$X_1 \triangleleft^+ Y_1 \triangleleft^+ X_2 \triangleleft^+ Y_2$, there is one 
solved form with one local variable, two with two, one with three, two with 
four, and so on ad infinitum. 

Fig. 13. Self-overlap 
5 Soundness 

Clearly, all rules in DPN are correct. For the soundness of DPN-saturation it 
remains to show that generated DPN-solved forms are satisfiable. First, we show 
that a special class of generated DPN-solved forms, called "simple", are 
satisfiable. Then we lift the result to arbitrary generated DPN-solved forms. 
We can safely restrict our attention to generated constraints: 

Lemma 3. Let $\varphi$ be a constraint without path equalities and let 
$\varphi \to_{DPN}^* \varphi'$. Then $\varphi'$ is generated. 

Definition 5. Let $\varphi$ be a constraint. A variable 
$X \in \mathcal{V}(\varphi)$ is called labeled in $\varphi$ iff 
$\exists X' \in \mathcal{V}(\varphi)$ such that $X{=}X'$ and 
$X'{:}f(X_1, \ldots, X_n)$ are in $\varphi$ for some term 
$f(X_1, \ldots, X_n)$. We call $\varphi$ simple if all its variables are 
labeled and there exists some root variable $Z \in \mathcal{V}(\varphi)$ such 
that $Z \triangleleft^* X$ is in $\varphi$ for all 
$X \in \mathcal{V}(\varphi)$. 



Lemma 4. A simple generated constraint in DPN-solved form is satisfiable. 

Proof. The constraint graph of a simple generated constraint $\varphi$ in 
DPN-solved form can be seen as a tree (plus redundant dominance edges and 
parallelism literals). So we can transform $\varphi$ into a tree $\tau$ by a 
standard construction. For every parallelism literal in $\varphi$, the 
corresponding parallelism holds in $\mathcal{M}^\tau$: As suggested by the 
examples in the previous section, DPN enforces that the computed path 
equalities encode valid correspondence functions in $\mathcal{M}^\tau$. 

Now suppose we have a generated non-simple constraint $\varphi$ in DPN-solved 
form. Take for instance the constraint in Fig. 14. We want to show that there 
is an extension $\varphi \wedge \varphi'$ of it that is simple, generated, and 
in DPN-solved form. We proceed by successively labeling unlabeled variables. 
Suppose we want to label $X$ first. The main idea is to make all variables 
minimally dominated by $X$ into $X$'s children, i.e. all variables $V$ with 
$X \triangleleft^+ V$ such that there is no intervening $W$ with 
$X \triangleleft^+ W \triangleleft^+ V$. 

Fig. 14. Extension 

So in the constraint in Fig. 14, $Y, Z, U$ are minimally dominated. However, we 
choose only one of $Z, U$ as we have $Z{=}U$. Hence, we would like to label $X$ 
by some function symbol of arity 2, extending the constraint, for instance, by 
$X{:}f(Y, Z)$. (If there is no symbol of suitable arity in $\Sigma$, we can 
always simulate it by a constant symbol and a symbol of arity $\geq 2$.) 
However, we have to make sure that we preserve solvedness during extension. For 
example, when adding $X{:}f(Y, Z)$ to the constraint in Fig. 14, we also add 
$Y \perp Z$ so as not to make (D.Lab.Disj) applicable. 

Specifically, we have to be careful when labeling a variable like $X_1$ in 
Fig. 15 (where grey arcs stand for path equality literals): $X_1$ is in 
$\mathrm{betw}_\varphi(X_1, X_2)$, and when we add $X_1{:}g(X)$ for some unary 
$g$, we also have to add $Y_1{:}g(X')$, otherwise (P.Copy.Lab) would be 
applicable. 

Fig. 15. Extension and parallelism 

(1) Eliminating/introducing a local variable 
    $X{=}Z \wedge \varphi =^{loc}_{\mathcal{G}} \varphi$  if $X \notin \mathcal{G}$, $X \notin \mathcal{V}(\varphi)$, $Z \in \mathcal{V}(\varphi)$ 
(2) Renaming a local variable 
    $\varphi =^{loc}_{\mathcal{G}} \varphi[Y/X]$  if $X \notin \mathcal{G}$, $Y \notin \mathcal{V}(\varphi) \cup \mathcal{G}$ 
(3) Exchanging representatives of an equivalence class in a constraint 
    $X{=}Y \wedge \varphi =^{loc}_{\mathcal{G}} X{=}Y \wedge \varphi[Y/X]$ 
(4) Set equivalence (associativity, commutativity, idempotency) 
    $\varphi =^{loc}_{\mathcal{G}} \psi$  if $\varphi = \psi$ 

Fig. 17. The equivalence relation $=^{loc}_{\mathcal{G}}$ on constraints 
handling local variables 

So, by adding a finite number of atomic constraints and without adding any 
new local variables, we can label at least one further unlabeled variable in 
the constraint, while keeping it in DPN-solved form. Thus, if we repeat this 
process a finite number of times, we can extend our generated constraint in 
DPN-solved form to a simple generated constraint in DPN-solved form, from which 
we can then read off a solution right away. 

Theorem 1 (Soundness). A generated constraint in DPN-solved form is sat- 
isfiable. 



6 Completeness 



DPN-saturation is complete in the sense that it computes every minimal solved 
form of a parallelism constraint. For parallelism constraints, the set 
inclusion order we have used previously is not sufficient; we adapt it such 
that it takes local variables into account. 

Consider Fig. 16. If (N.New) is applied to $X$ first, this yields 
$p\binom{X_1\;Y_1}{X\;X'}$ for a new local variable $X'$, plus $Y_1{:}g(X')$ 
and $X'{=}Y$ by (P.Copy.Lab) and (D.Eq.Decom). Accordingly, if (N.New) is 
applied to $Y$ first, we get 
$p\binom{Y_1\;X_1}{Y\;Y'} \wedge X_1{:}g(Y') \wedge Y'{=}X$ for a new local 
variable $Y'$. The nondeterministic choice in applying (N.New) leads to two 
DPN-solved forms incomparable by $\subseteq$ which, however, we do not want to 
distinguish. 

Fig. 16. Local variables? 

To solve this problem, we use an equivalence relation handling local variables: 
Let $\mathcal{G} \subseteq \mathcal{V}$, then $=^{loc}_{\mathcal{G}}$ is the 
smallest equivalence relation on constraints satisfying the axioms in Fig. 17. 
From this equivalence and subset inclusion, we define the new partial order 
$\leq_{\mathcal{G}}$. 

Definition 6. For $\mathcal{G} \subseteq \mathcal{V}$ let $\leq_{\mathcal{G}}$ 
be the reflexive and transitive closure 
$(\subseteq \cup =^{loc}_{\mathcal{G}})^*$. 
We also write $=_{\mathcal{G}}$ for $\leq_{\mathcal{G}} \cap \geq_{\mathcal{G}}$. 
We return to our above example concerning Fig. 16. Let 
$\mathcal{G} = \{X_1, X_2, Y_1, Y_2, X, Y\}$. Then 
$X_1{:}g(X) \wedge Y_1{:}g(Y) \wedge Y_1{:}g(X') \wedge X'{=}Y =^{loc}_{\mathcal{G}} X_1{:}g(X) \wedge Y_1{:}g(Y) \wedge X'{=}Y$ 
by axioms (3) and (4). This, in turn, is $=^{loc}_{\mathcal{G}}$-equivalent to 
$X_1{:}g(X) \wedge Y_1{:}g(Y)$ by axiom (1). Again by axiom (1), this is 
$=^{loc}_{\mathcal{G}}$-equivalent to 
$X_1{:}g(X) \wedge Y_1{:}g(Y) \wedge Y'{=}X$, which equals 
$X_1{:}g(X) \wedge X_1{:}g(Y') \wedge Y_1{:}g(Y) \wedge Y'{=}X$ by axioms (4) 
and (3). 
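The chain of equivalences in this example can be read as a normalization: substitute each local variable by a variable it is equated with (axiom (3)), let duplicates collapse (axiom (4)), and drop the now isolated equation (axiom (1)). The sketch below implements only this one direction and only for labeling and equality literals; it assumes, as in the example, that the chosen representative still occurs in the remaining constraint. The tuple encoding and the names `rename` and `eliminate_locals` are hypothetical.

```python
def rename(lit, old, new):
    """Replace variable `old` by `new` inside one literal."""
    if lit[0] == "lab":                       # ("lab", X, f, (X1, ..., Xn))
        head = new if lit[1] == old else lit[1]
        kids = tuple(new if k == old else k for k in lit[3])
        return ("lab", head, lit[2], kids)
    return (lit[0],) + tuple(new if v == old else v for v in lit[1:])


def eliminate_locals(phi, global_vars):
    """Remove local variables that are equated with some other variable."""
    phi = set(phi)
    changed = True
    while changed:
        changed = False
        for lit in sorted(phi):
            if lit[0] != "eq":                # ("eq", X, Y) stands for X = Y
                continue
            a, b = lit[1], lit[2]
            if a in global_vars and b not in global_vars:
                a, b = b, a                   # make `a` the local side
            if a in global_vars or a == b:
                continue
            # Axiom (3): substitute b for a; axiom (4): the set merges duplicates;
            # axiom (1): the equation a = b is dropped once a no longer occurs.
            phi = {rename(other, a, b) for other in phi if other != lit}
            changed = True
            break
    return phi


G = {"X1", "X2", "Y1", "Y2", "X", "Y"}
form_a = {("lab", "X1", "g", ("X",)), ("lab", "Y1", "g", ("Y",)),
          ("lab", "Y1", "g", ("X'",)), ("eq", "X'", "Y")}
form_b = {("lab", "X1", "g", ("X",)), ("lab", "X1", "g", ("Y'",)),
          ("lab", "Y1", "g", ("Y",)), ("eq", "Y'", "X")}
normal_a = eliminate_locals(form_a, G)
normal_b = eliminate_locals(form_b, G)
```

Both solved forms from the example normalize to the same constraint, which is one way to see why they should not be distinguished.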



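Lemma 5. The partial order $\leq_{\mathcal{G}}$ can be factored out into the 
relational composition of its components, i.e., $\leq_{\mathcal{G}}$ is 
$\subseteq \circ =^{loc}_{\mathcal{G}}$. 

Lemma 6. If $\varphi \leq_{\mathcal{G}} \psi$ and $\psi$ is a DPN-solved form, 
then there exists a DPN-solved form $\psi'$ such that 
$\varphi \subseteq \psi' =^{loc}_{\mathcal{G}} \psi$. 

Lemma 7. Let $\varphi$ be a constraint, $\mathcal{G} \subseteq \mathcal{V}$, 
and $\psi$ a DPN-solved form with $\varphi \leq_{\mathcal{G}} \psi$. If a rule 
$\rho \in DPN$ is applicable to $\varphi$, then there exists a constraint 
$\varphi'$ satisfying $\varphi \to_{\{\rho\}} \varphi'$ and 
$\varphi' \leq_{\mathcal{G}} \psi$. 

Proof. By Lemma 6 there exists a DPN-solved form $\psi'$ with 
$\varphi \subseteq \psi' =^{loc}_{\mathcal{G}} \psi$. First, suppose $\rho$ is 
a rule $\bar\varphi \to \bigvee_{i=1}^{n} \varphi_i$ in DP. Then there exists 
an $i$ such that $\varphi_i \subseteq \psi'$, hence 
$\varphi \wedge \varphi_i \subseteq \psi'$. Now suppose that $\rho \in N$: Let 
$\rho$ be $\bar\varphi \to p\binom{X_1\;Y_1}{X\;X'}$ with 
$X' \notin \mathcal{G} \cup \mathcal{V}(\varphi)$. Then 
$p\binom{X_1\;Y_1}{X\;Y} \in \psi'$ for some variable $Y$. But then by 
axiom (2) of Fig. 17, we have $\psi' =^{loc}_{\mathcal{G}} \psi'[Z'/X']$ for 
some $Z' \notin \mathcal{G} \cup \mathcal{V}(\psi') \cup \mathcal{V}(\varphi)$, 
which by axiom (1) is $=^{loc}_{\mathcal{G}}$-equivalent to 
$\psi'[Z'/X'] \wedge Y{=}X'$, which in turn equals 
$\psi'[Z'/X'] \wedge Y{=}X' \wedge p\binom{X_1\;Y_1}{X\;X'}$ by axiom (3). Call 
this last constraint $\psi''$, then 
$\varphi \wedge p\binom{X_1\;Y_1}{X\;X'} \subseteq \psi'' =^{loc}_{\mathcal{G}} \psi$. 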

It remains to show that there exists a DPN-branch of finite length from 
$\varphi$ to each of its minimal solved forms. If saturation rules can be 
applied in any order, N can speculatively generate an arbitrary number of local 
variables. For example, for the constraint in Fig. 18, it could successively 
postulate one path equality after another, each for a fresh local variable. We 
solve this problem by choosing a special rule application order in our 
completeness proof: After each $\to_N$ step, we first form a DP-saturation 
before considering another rule from N. We use a distance measure between a 
smaller and a larger constraint to prove completeness for DPN saturation 
obeying this application order. The two elements of the measure are: the number 
of distinct variables in the larger constraint not present in the smaller one; 
and the minimum number of correspondences still to be computed for a 
constraint. 

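Definition 7. We define the number $\mathrm{lc}(S, \varphi)$ of lacking 
correspondents in $\varphi$ for a set $S \subseteq \mathcal{V}(\varphi)$ by 

$$\mathrm{lc}(S, \varphi) = \sum \{\, \mathrm{lc}_{X_1/X_2 \sim Y_1/Y_2}(X, \varphi) + \mathrm{lc}_{Y_1/Y_2 \sim X_1/X_2}(X, \varphi) \mid X \in S \text{ and } X_1/X_2 \sim Y_1/Y_2 \in \varphi \,\}$$

where we fix the values of the auxiliary terms by setting, for all 
$W, U, U', V, V' \in \mathcal{V}$: 

$$\mathrm{lc}_{U/V \sim U'/V'}(W, \varphi) = \begin{cases} 1 & \text{if } W \in \mathrm{betw}_\varphi(U, V) \text{ and } p\binom{U\;U'}{W\;W'} \text{ is not in } \varphi \text{ for any } W' \\ 0 & \text{otherwise} \end{cases}$$

Fig. 18. Termination? 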



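Definition 8. For constraints $\varphi_1 \subseteq \varphi_2$, let 
$\mathrm{diff}(\varphi_1, \varphi_2)$ be the size of the set 
$\{X \in \mathcal{V}(\varphi_2) \mid X \neq Y \in \varphi_2 \text{ for all } Y \in \mathcal{V}(\varphi_1)\}$. 

We call a set $S \subseteq \mathcal{V}(\varphi)$ of variables an inequality set 
for $\varphi$ iff $X \neq Y \in \varphi$ for any distinct $X, Y \in S$. 

For constraints $\varphi_2$ that are saturated with respect to 
(P.Distr.Project), $\mathrm{diff}(\varphi_1, \varphi_2)$ is the number of 
variables $X$ in $\varphi_2$ such that $X{=}Y \notin \varphi_2$ for all 
$Y \in \mathcal{V}(\varphi_1)$. 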
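Both notions of Definition 8 are direct counting and membership tests. As a minimal sketch, assume constraints are sets of binary literals `(kind, X, Y)` with inequality literals tagged `"neq"` and treated as symmetric; the function names are hypothetical.

```python
def variables_of(phi):
    """Variables of a constraint made of binary literals (kind, X, Y)."""
    return {v for lit in phi for v in lit[1:]}


def unequal(phi, x, y):
    """X != Y is in phi, reading inequality literals as symmetric."""
    return ("neq", x, y) in phi or ("neq", y, x) in phi


def diff(phi1, phi2):
    """Number of variables of phi2 that phi2 separates from every variable of phi1."""
    v1 = variables_of(phi1)
    return sum(1 for x in variables_of(phi2)
               if all(unequal(phi2, x, y) for y in v1))


def is_inequality_set(s, phi):
    """All distinct members of s are pairwise unequal in phi."""
    return all(unequal(phi, x, y) for x in s for y in s if x != y)


phi1 = {("dom", "X", "Y")}
phi2 = phi1 | {("dom", "X", "Z"), ("neq", "Z", "X"), ("neq", "Z", "Y")}
new_variables = diff(phi1, phi2)
```

In this example only `Z` is known to differ from both `X` and `Y`, so it is the single variable counted by `diff`.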

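Definition 9. Let $\varphi, \psi$ be constraints and 
$\mathcal{G} \subseteq \mathcal{V}$ with $\varphi \leq_{\mathcal{G}} \psi$. 
Then the $\mathcal{G}$-measure $\mu_{\mathcal{G}}(\varphi, \psi)$ for $\varphi$ 
and $\psi$ is the sequence $(\mu^1_{\mathcal{G}}(\varphi, \psi), \mu^2(\varphi))$, 
where: 

- $\mu^1_{\mathcal{G}}(\varphi, \psi) = \min\{\mathrm{diff}(\varphi, \psi') \mid \varphi \subseteq \psi' =^{loc}_{\mathcal{G}} \psi \text{ and } \psi' \text{ is DPN-solved}\}$ 
- $\mu^2(\varphi) = \min\{\mathrm{lc}(S, \varphi) \mid S \text{ is a maximal inequality set for } \varphi\}$ 

We order $\mathcal{G}$-measures by the lexicographic ordering $<$ on sequences 
of natural numbers, which is well-founded. The main idea of the following proof 
is that after each $\to_N$ step and subsequent DP-saturation, the 
$\mathcal{G}$-measure between a constraint and its solved form has strictly 
decreased. 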

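Theorem 2 (Completeness). Let $\varphi$ be a constraint, 
$\mathcal{G} \subseteq \mathcal{V}$, and $\psi$ a $\leq_{\mathcal{G}}$-minimal 
DPN-solved form for $\varphi$. Then there exists a DPN-solved form 
$\psi' =_{\mathcal{G}} \psi$ which can be reached from $\varphi$, i.e. 
$\varphi \to_{DPN}^* \psi'$. 

Proof. W.l.o.g. let $\varphi$ be DP-closed. If no rule from N is applicable to 
$\varphi$ then $\varphi =_{\mathcal{G}} \psi$ by the minimality of $\psi$. If a 
rule $\rho \in N$ is applicable to $\varphi$, then by Lemma 7 there exist 
$\varphi', \varphi''$ such that 
$\varphi \to_{\{\rho\}} \varphi'' \to_{DP}^* \varphi' \leq_{\mathcal{G}} \psi$, 
and $\varphi'$ is DP-saturated. By induction, it is sufficient to show that 
$\mu_{\mathcal{G}}(\varphi', \psi) < \mu_{\mathcal{G}}(\varphi, \psi)$. Note 
that because $\varphi$ is DP-closed, a maximal inequality set within $\varphi$ 
contains exactly one variable from each syntactic variable equivalence class 
represented in $\varphi$; and 
$\mathrm{lc}(\{X\}, \varphi) = \mathrm{lc}(\{Y\}, \varphi)$ whenever 
$X{=}Y \in \varphi$ because of saturation under (P.Path.Eq.1). The value of 
$\mathrm{diff}(\varphi, \psi')$ is minimal for $\psi'$ if for any 
$Y \in \mathcal{V}(\psi')$ that is inequal to all $X \in \mathcal{V}(\varphi)$, 
$Y$ is local$^1$ and there is no other variable $Z \in \mathcal{V}(\psi')$ 
distinct from $Y$ with $Y{=}Z \in \psi'$. 

Let $\varphi''$ be $\varphi \wedge p\binom{X_1\;Y_1}{X\;X'}$. In $\varphi'$, 
(P.Distr.Project) has been applied to $X'$ and all variables in 
$\mathcal{V}(\varphi)$. Let $\psi' =^{loc}_{\mathcal{G}} \psi$ with 
$\varphi \subseteq \psi'$ and minimal $\mathrm{diff}(\varphi, \psi')$. The 
constraint $\psi'$ contains $p\binom{X_1\;Y_1}{X\;Z}$ for some $Z$. W.l.o.g. we 
pick a $\psi'$ that does not contain $X'$. 

$^1$ The variable $Y$ is local because 
$\mathcal{V}(\psi') \cap \mathcal{G} = \mathcal{V}(\psi) \cap \mathcal{G} = \mathcal{V}(\varphi) \cap \mathcal{G}$, 
otherwise the value of $\mathrm{diff}(\varphi, \psi')$ would not be minimal for 
$\psi'$. 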
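(T.Trans.H)  $p\binom{U_1\;V_1}{U_2\;V_2} \wedge p\binom{V_1\;W_1}{V_2\;W_2} \to p\binom{U_1\;W_1}{U_2\;W_2}$ 
(T.Trans.V)  $p\binom{X_1\;Y_1}{X_2\;Y_2} \wedge p\binom{X_2\;Y_2}{X_3\;Y_3} \to p\binom{X_1\;Y_1}{X_3\;Y_3}$ 
(T.Diff.1)   $p\binom{X_1\;Y_1}{X_2\;Y_2} \wedge p\binom{X_1\;Y_1}{X_3\;Y_3} \wedge X_2 \triangleleft^* X_3 \wedge Y_2 \triangleleft^* Y_3 \to p\binom{X_2\;Y_2}{X_3\;Y_3}$ 
(T.Diff.2)   $p\binom{X_1\;Y_1}{X_3\;Y_3} \wedge p\binom{X_2\;Y_2}{X_3\;Y_3} \wedge X_1 \triangleleft^* X_2 \wedge Y_1 \triangleleft^* Y_2 \to p\binom{X_1\;Y_1}{X_2\;Y_2}$ 

Fig. 19. Optional propagation by rules T: transitivity of path equalities 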



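- If $X'{=}Y \in \varphi'$ for some $Y \in \mathcal{V}(\varphi)$, then 
  $\mu^2(\varphi') < \mu^2(\varphi)$ and 
  $\mu^1_{\mathcal{G}}(\varphi, \psi) = \mu^1_{\mathcal{G}}(\varphi', \psi)$: 
  $\mathrm{lc}(\{V\}, \varphi') < \mathrm{lc}(\{X\}, \varphi)$ whenever 
  $V{=}X \in \varphi'$, and either $X$ or some other member of its equivalence 
  class must be in each maximal inequality set. At the same time, a maximal 
  inequality set within $\varphi'$ can contain only one of $X'$ and $Y$, so 
  $X'$ contributes nothing additional to $\mu^2(\varphi')$. 
  Let $\psi''$ be $\psi' \wedge X'{=}Z \wedge \psi'[X'/Z]$. Then $\psi''$ is 
  DPN-solved, and $\varphi' \subseteq \psi''$. We have 
  $\mathrm{diff}(\varphi', \psi'') = \mathrm{diff}(\varphi, \psi')$ because for 
  any $V \neq Y \in \psi'$, $\psi''$ contains $V \neq Y \wedge V \neq X'$. 
  Furthermore, $\mathrm{diff}(\varphi', \psi'')$ is minimal because the only 
  variable in $\psi''$ not in $\psi'$ is $X'$. 

- If $X' \neq Y \in \varphi'$ for all $Y \in \mathcal{V}(\varphi)$, then 
  $\mu^1_{\mathcal{G}}(\varphi', \psi) < \mu^1_{\mathcal{G}}(\varphi, \psi)$: 
  Let $\psi''$ be $\psi'[X'/Z]$. Thus, $\psi' =^{loc}_{\mathcal{G}} \psi''$ by 
  axiom (2) and because $Z$ must be local, and $Z{=}Z'$ is not in $\psi''$ for 
  any distinct $Z'$ because of the minimality of 
  $\mathrm{diff}(\varphi, \psi')$, as pointed out above. Obviously $\psi''$ is 
  a DPN-solved form with $\varphi' \subseteq \psi''$. Furthermore, 
  $\mathrm{diff}(\varphi', \psi'') = \mathrm{diff}(\varphi, \psi') - 1$ because 
  we must have had $Z \neq V \in \psi'$ for all $V \in \mathcal{V}(\varphi)$. 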

7 Improving Propagation 

In this section, we extend DPN by the set of optional propagation rules T, 
which states transitivity properties of path equalities. Using these rules, 
much (but not all) of what (P.Distr.Project) achieves by case distinction can 
be gained through propagation. Interestingly, though, we have not yet 
encountered a linguistically relevant example requiring (P.Distr.Project). 

The set of saturation rules T comprises all instances of the schemata in 
Fig. 19. Schema (T.Trans.H) describes horizontal transitivity of path equality 
constraints, while (T.Trans.V), (T.Diff.1) and (T.Diff.2) all deal with 
vertical transitivity. The correctness of these rules is obvious. 
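To illustrate how the two transitivity schemata propagate, the following sketch closes a set of path equalities under (T.Trans.H) and (T.Trans.V) only; (T.Diff.1) and (T.Diff.2) are left out because they also need dominance literals. A path equality $p\binom{X_1\;Y_1}{X_2\;Y_2}$ is encoded as the tuple `(X1, Y1, X2, Y2)`; this encoding, the variable names, and the function name are assumptions made for the example.

```python
def close_transitive(path_eqs):
    """Close a set of path equalities under (T.Trans.H) and (T.Trans.V)."""
    eqs = set(path_eqs)
    while True:
        new = set()
        for (a1, b1, a2, b2) in eqs:
            for (c1, d1, c2, d2) in eqs:
                # (T.Trans.H): p(U1 V1; U2 V2) and p(V1 W1; V2 W2) give p(U1 W1; U2 W2).
                if (b1, b2) == (c1, c2):
                    new.add((a1, d1, a2, d2))
                # (T.Trans.V): p(X1 Y1; X2 Y2) and p(X2 Y2; X3 Y3) give p(X1 Y1; X3 Y3).
                if (a2, b2) == (c1, d1):
                    new.add((a1, b1, c2, d2))
        if new <= eqs:
            return eqs
        eqs |= new


# A chain of three correspondences, each linking one segment to the next.
chain = {
    ("X1", "X2", "U", "U1"),
    ("X2", "X3", "U1", "U2"),
    ("X3", "X4", "U2", "U3"),
}
closed = close_transitive(chain)
```

Horizontal transitivity composes the chain, so the closure relates `U` below `X1` directly to `U3` below `X4` without any case distinction. The loop terminates because no new variables are introduced.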

By generatedness, each path equality inferred by DPN saturation describes a 
correspondence for some parallelism literal. With T, this is different. 
Consider, for example, Fig. 20 where DPN saturation can infer the 
correspondence $p\binom{X_1\;Y_1}{U_1\;V_1}$. (P.Root) yields 
$p\binom{U_1\;V_1}{U_2\;V_2}$. (T.Trans.V) can add 
$p\binom{X_1\;Y_1}{U_2\;V_2}$, a path equality that does not describe any 
syntactic correspondence for any of the two parallelism literals present. 
Actually, the reason why we record correspondence by path equalities, as 
quadruples of variables, is that they support transitivity rules and thus allow 
for stronger propagation. 

Fig. 20. Vertical transitivity 

Fig. 21 shows an example where T-saturation can apply a propagation rule to 
make an inference that would need case distinction by (P.Distr.Project) with 
DPN-saturation alone. (N.New) can be applied to 
$U \in \mathrm{betw}_\varphi(X_1, Y_1)$, yielding $p\binom{X_1\;X_2}{U\;U'}$ 
for a new local variable $U'$, and (P.Copy.Dom) adds 
$X_2 \triangleleft^* U' \triangleleft^* Y_2$. Applying (N.New) and (P.Copy.Dom) 
two more times each, we get first $p\binom{X_2\;X_3}{U'\;U''}$ and 
$X_3 \triangleleft^* U'' \triangleleft^* Y_3$, and then 
$p\binom{X_3\;X_4}{U''\;U'''}$ and 
$X_4 \triangleleft^* U''' \triangleleft^* Y_4$. Do we now need to apply (N.New) 
to $U'''$ in $X_4/Y_4 \sim X_1/Y_1$? In DPNT, we do not: We get 
$p\binom{X_1\;X_3}{U\;U''}$ and then $p\binom{X_1\;X_4}{U\;U'''}$ by 
(T.Trans.H). Note in passing that $p\binom{X_1\;X_3}{U\;U''}$ does not describe 
a correspondence for any parallelism literal in $\varphi$. 

In DPN, on the other hand, we would need to apply (N.New) to $U'''$, yielding 
$p\binom{X_4\;X_1}{U'''\;U''''}$, and then use (P.Distr.Project) to guess 
whether $U{=}U''''$ or $U \neq U''''$. Only the first case leads to a 
terminating saturation process. 

A $\leq_{\mathcal{G}}$-minimal DPNT-solved form for a constraint $\varphi$ may 
differ from a $\leq_{\mathcal{G}}$-minimal DPN-solved one: there may be 
additional path equalities. To ensure that they do not differ in more aspects, 
we impose an additional minor condition (which could possibly be omitted): We 
apply (P.Copy.Dom) only if the path equalities $p\binom{X_1\;Y_1}{U_i\;V_i}$, 
$i = 1, 2$, describe a correspondence for some parallelism literal.$^2$ 

Fig. 21. Horizontal transitivity 



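Lemma 8. Let $\varphi$ be a constraint, $\mathcal{G} \subseteq \mathcal{V}$, 
and $\psi$ a $\leq_{\mathcal{G}}$-minimal DPN-solved form for $\varphi$. Then 
there exists a conjunction $\psi'$ of path equalities such that 
$\psi \wedge \psi'$ is a $\leq_{\mathcal{G}}$-minimal DPNT-solved form for 
$\varphi$. 

Proposition 2 (Soundness). If $\varphi$ is generated and 
$\varphi \to_{DPNT}^* \psi$ for a DPNT-solved form $\psi$, then $\psi$ is 
satisfiable. 

Proposition 3 (Completeness). Let $\varphi$ be a constraint, 
$\mathcal{G} \subseteq \mathcal{V}$, and $\psi$ a $\leq_{\mathcal{G}}$-minimal 
DPNT-solved form for $\varphi$. Then there exists a constraint 
$\psi' =_{\mathcal{G}} \psi$ in DPNT-solved form with 
$\varphi \to_{DPNT}^* \psi'$. 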

Implementation. A first prototype implementation of DPNT is available as 
an applet on the Internet [3]. Saturation rules are applied in an order refining 
the order mentioned above: A distribution rule is only applied to a constraint 
saturated under the propagation rules from DPT. A rule from N is only applied 
to a constraint saturated under DPT. This implementation handles ellipses in
natural language as well as the previously mentioned implementation based
on context unification [17]. But the two implementations differ with respect to
scope ambiguities, i.e. dominance constraint solving: While the program based on
context unification could handle scope ambiguities with at most 3 quantifiers,
the parallelism constraint procedure resolves scope ambiguities of 5 quantifiers
in only 6 seconds and can even deal with more quantifiers.

2 I.e. we now have (P.Copy.Dom) U1RU2 A ALi P (ul v( ) Xx/X^^Yx/Y^
VxRV2, where RG {<i*,T,A} and Ux,U2 G betw^ 3 (Ai, A2).

8 Conclusion 

We have presented a semi-decision procedure for parallelism constraints which 
terminates for the important fragment of dominance constraints. It uses path 
equality constraints to record correspondence, allowing for strong propagation. 
We have proved the procedure sound and complete. In the process, we have 
introduced the concept of a minimal solved form for parallelism constraints. 

Many things remain to be done. One important problem is to describe the 
linguistically relevant fragment of parallelism constraints and see whether it is 
decidable. Then, the prototype implementation we have is not optimized in any 
way. We would like to replace it by one using constraint technology and to see 
how that scales up to large examples from linguistics. Also, we would like to 
apply parallelism constraints to a broader range of linguistic phenomena.

Abstract. We consider the problem of higher-order matching restricted
to the set of linear λ-terms (i.e., λ-terms where each abstraction $\lambda x.M$
is such that there is exactly one free occurrence of $x$ in $M$). We prove
that this problem is decidable by showing that it belongs to NP. Then 
we prove that this problem is in fact NP-complete. Finally, we discuss 
some heuristics for a practical algorithm. 



1 Introduction 

Higher-order unification, which is the problem of solving a syntactic equation
(modulo β or βη) between two simply typed λ-terms, is known to be undecidable
[8], even in the second-order case [7]. On the other hand, if one of the two
λ-terms does not contain any unknown, the problem (which is called, in this case,
higher-order matching) becomes simpler. Indeed, second-order [10], third-order 
[5], and fourth-order [17] matching have been proved to be decidable (See also 
[21] for a survey, including lower and upper bounds on the complexity). In the 
general case, however, the decidability of higher-order matching is still open. 

Since higher-order unification is known to be undecidable, it is natural to 
investigate whether one can recover decidability for some restricted form of it. 
For instance, Miller's higher-order patterns form a class of λ-terms for which
unification is decidable [14]. Another restricted form of higher-order unification 
(in fact, in this case, second-order unification) is context unification [3, 4], which 
may be seen as a generalisation of word unification. While the decidability of 
context unification is open in general, some subcases of it have been shown to 
be decidable [3, 4, 11, 12, 19]. 

In this paper, we study higher-order matching for a quite restricted class of
λ-terms, namely, the λ-terms that correspond to the proofs of the implicative
fragment of multiplicative linear logic [6]. These λ-terms are linear in the sense
that each abstraction $\lambda x.M$ is such that there is exactly one free occurrence
of $x$ in $M$. Our main result is the decidability of linear higher-order matching.
Whether linear higher-order unification is decidable is an open problem related, 
in the second-order case, to context unification [11]. 

Linear higher-order unification (in the sense of this paper) has been investigated
by Cervesato and Pfenning [2], and by Levy [11]. Cervesato and Pfenning
consider the problem of higher-order unification for a λ-calculus whose type
system corresponds to full intuitionistic linear logic. They are not interested in
decidability results (they know that the problem they study is undecidable because
the simply-typed λ-calculus may be embedded in the calculus they consider)
but in giving a semi-decision procedure in the spirit of Huet's pre-unification
algorithm [9]. Levy, on the other hand, is interested in decidability results but
only for the second-order case, whose decidability implies the decidability of 
context unification. 

Both in the case of Levy and in the case of Cervesato and Pfenning, there 
was no reason for considering matching rather than unification. In the first case, 
the matching variant of the problem is subsumed by second-order matching, 
which is known to be decidable. In the second case, the matching variant of the 
problem subsumes full higher-order matching, which is known to be a hard open 
problem. 

Our own motivations in studying linear higher-order matching come from the 
use of categorial grammars in computational linguistics. Natural language pars- 
ing using modern categorial grammars [15, 16] amounts to automated deduction 
in logics akin to the multiplicative fragment of linear logic. Consequently, the 
syntactic structures that result from categorial parsing may be seen, through the 
Curry-Howard correspondence, as linear λ-terms. As a consequence, higher-order
unification and matching restricted to the set of linear λ-terms have applications
in this categorial setting. In [13], for instance, Morrill and Merenciano show how 
to use linear higher-order matching to generate a syntactic form (i.e., a sentence 
in natural language) from a given logical form. 

The paper is organised as follows. In the next section, we review several basic 
definitions and define precisely what is a linear higher-order matching problem. 
In Section 3, we prove that linear higher-order matching is decidable while, in 
Section 4, we prove its NP-completeness. Finally, in Section 5, we specify a more 
practical algorithm and we discuss some implementation issues. 

2 Basic Definitions 

Definition 1. Let $\mathcal{A}$ be a finite set, the elements of which are called atomic
types. The set $\mathcal{F}$ of linear functional types is defined according to the following
grammar:

$$\mathcal{F} ::= \mathcal{A} \mid (\mathcal{F} \multimap \mathcal{F})$$

We let the lowercase Greek letters ($\alpha, \beta, \gamma, \ldots$) range over $\mathcal{F}$.

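Definition 2. Let $(\Sigma_\alpha)_{\alpha \in \mathcal{F}}$ be a family of pairwise disjoint finite sets indexed
by $\mathcal{F}$, whose almost every member is empty. Let $(\mathcal{X}_\alpha)_{\alpha \in \mathcal{F}}$ and $(\mathcal{Y}_\alpha)_{\alpha \in \mathcal{F}}$ be
two families of pairwise disjoint countably infinite sets indexed by $\mathcal{F}$, such that
$(\bigcup_{\alpha \in \mathcal{F}} \mathcal{X}_\alpha) \cap (\bigcup_{\alpha \in \mathcal{F}} \mathcal{Y}_\alpha) = \emptyset$. The set $\mathcal{T}$ of raw λ-terms is defined according to
the following grammar:

$$\mathcal{T} ::= \Sigma \mid \mathcal{X} \mid \mathcal{Y} \mid \lambda \mathcal{X}.\, \mathcal{T} \mid (\mathcal{T}\, \mathcal{T}),$$

where $\Sigma = \bigcup_{\alpha \in \mathcal{F}} \Sigma_\alpha$, $\mathcal{X} = \bigcup_{\alpha \in \mathcal{F}} \mathcal{X}_\alpha$, and $\mathcal{Y} = \bigcup_{\alpha \in \mathcal{F}} \mathcal{Y}_\alpha$. ■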

In the above definition, the elements of $\Sigma$ are called the constants, the
elements of $\mathcal{X}$ are called the λ-variables, and the elements of $\mathcal{Y}$ are called the
meta-variables or the unknowns. We let the lowercase Roman letters (a, b, c, . . .)
range over the constants, the lowercase italic letters (x, y, z, . . .) range over the
λ-variables, the uppercase bold letters (X, Y, Z, . . .) range over the unknowns,
and the uppercase italic letters (M, N, O, . . .) range over the λ-terms. The
notions of free and bound occurrences of a λ-variable are defined as usual, and we
write FV(M) for the set of λ-variables that occur free in a λ-term M. Finally, a
λ-term that does not contain any unknown is called a pure λ-term.

If $P = \{i_0, i_1, \ldots, i_n\}$ is a (linearly ordered) set of indices, we write $\lambda x_P.\, M$
for the λ-term $\lambda x_{i_0}.\, \lambda x_{i_1}. \ldots \lambda x_{i_n}.\, M$. Similarly, we write $M(N_P)$ or $M(N_i)_{i \in P}$
for $(\ldots((M\, N_{i_0})\, N_{i_1}) \ldots N_{i_n})$. As in set theory, we let $n = \{0, 1, \ldots, n-1\}$.

These notations will be extensively used in Section 5.

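We then define the notion of typed linear λ-term.

Definition 3. The family $(\mathcal{T}_\alpha)_{\alpha \in \mathcal{F}}$ of sets of typed linear λ-terms is inductively
defined as follows:

1. if $a \in \Sigma_\alpha$ then $a \in \mathcal{T}_\alpha$;

2. if $\mathbf{X} \in \mathcal{Y}_\alpha$ then $\mathbf{X} \in \mathcal{T}_\alpha$;

3. if $x \in \mathcal{X}_\alpha$ then $x \in \mathcal{T}_\alpha$;

4. if $x \in \mathcal{X}_\alpha$, $M \in \mathcal{T}_\beta$, and $x \in FV(M)$, then $\lambda x.\, M \in \mathcal{T}_{(\alpha \multimap \beta)}$;

5. if $M \in \mathcal{T}_{(\alpha \multimap \beta)}$, $N \in \mathcal{T}_\alpha$, and $FV(M) \cap FV(N) = \emptyset$, then $(M\, N) \in \mathcal{T}_\beta$. ■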

Clauses 4 and 5 imply that any typed linear λ-term $\lambda x.\, M$ is such that there is
exactly one free occurrence of $x$ in $M$. Note, on the other hand, that constants
and unknowns may occur several times in the same linear λ-term.
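The two linearity conditions can be checked mechanically. The following is a minimal sketch, not taken from the paper: it ignores types altogether and only tests clauses 4 and 5 on raw terms, which are encoded as nested tuples (an encoding assumed here purely for illustration).

```python
# Raw terms as nested tuples (an encoding assumed for this sketch):
#   ("const", a), ("unk", X), ("var", x), ("lam", x, body), ("app", fun, arg)

def free_vars(t):
    """Set of lambda-variables occurring free in t."""
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "lam":
        return free_vars(t[2]) - {t[1]}
    if tag == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return set()  # constants and unknowns carry no free variables


def is_linear(t):
    """Check clauses 4 and 5 of Definition 3, ignoring types."""
    tag = t[0]
    if tag == "lam":
        # Clause 4: the bound variable must occur free in the body.
        return t[1] in free_vars(t[2]) and is_linear(t[2])
    if tag == "app":
        # Clause 5: function and argument share no free variable.
        disjoint = not (free_vars(t[1]) & free_vars(t[2]))
        return disjoint and is_linear(t[1]) and is_linear(t[2])
    return True


# lambda x. a (a x): the constant a occurs twice, which is allowed.
twice_a = ("lam", "x", ("app", ("const", "a"),
                        ("app", ("const", "a"), ("var", "x"))))
print(is_linear(twice_a))
```

Because clause 5 forbids shared free variables, a bound variable that occurs at all occurs exactly once, so a set of free variables is enough and no occurrence counting is needed.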

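From now on, we define $\mathcal{T}$ to be $\bigcup_{\alpha \in \mathcal{F}} \mathcal{T}_\alpha$ (which is a proper subset of
the set of raw λ-terms). It is easy to prove that the sets $(\mathcal{T}_\alpha)_{\alpha \in \mathcal{F}}$ are pairwise
disjoint. Consequently, we may define the type of a typed linear λ-term $M$ to be
the unique linear type $\alpha$ such that $M \in \mathcal{T}_\alpha$.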

We take for granted the usual notions of α-conversion, η-expansion, β-redex,
one step β-reduction ($\to_\beta$), $n$ step β-reduction ($\to_\beta^n$), many step β-reduction
($\twoheadrightarrow_\beta$), and β-conversion ($=_\beta$). We use $\Delta$ (possibly with a subscript) to range
over β-redexes, and we write $\Delta \subset M$ to say that $\Delta$ is a β-redex occurring in a
λ-term $M$.

We let $M[x:=N]$ denote the usual capture-avoiding substitution of a
λ-variable by a λ-term. Similarly, $M[\mathbf{X}:=N]$ denotes the capture-avoiding
substitution of an unknown by a λ-term. Note that any β-redex $(\lambda x.\, M)\, N$
occurring in a linear λ-term is such that $FV(\lambda x.\, M) \cap FV(N) = \emptyset$. Moreover we
may suppose, by α-conversion, that $x \notin FV(N)$. Consequently, when writing
$M[x:=N]$ (or $M[\mathbf{X}:=N]$) we always assume that $FV(M) \cap FV(N) = \emptyset$. Finally
we abbreviate $M[x_0:=N_0] \cdots [x_{n-1}:=N_{n-1}]$ as $M[x_i:=N_i]_{i \in n}$.

In Section 5, we will also consider a more semantic version of substitution, i.e.,
a function $\sigma : \mathcal{Y} \to \mathcal{T}$ that is the identity almost everywhere. The finite subset
of $\mathcal{Y}$ where $\sigma(\mathbf{X}) \neq \mathbf{X}$ is called the domain of the substitution (in notation,
$dom(\sigma)$). It is clear that such a substitution $\sigma$ determines a unique syntactic
substitution $(\mathbf{X} := \sigma(\mathbf{X}))_{\mathbf{X} \in dom(\sigma)}$, and conversely.

We now give a precise definition of the matching problem with which we are 
concerned. 

Definition 4. A linear higher-order matching problem modulo β (respectively,
modulo βη) is a pair of typed linear λ-terms $\langle M, N \rangle$ of the same type such that $N$
is pure (i.e., does not contain any unknown). Such a problem admits a solution if
and only if there exists a substitution $(\mathbf{X}_i := O_i)_{i \in n}$ such that $M[\mathbf{X}_i := O_i]_{i \in n} =_\beta N$
(respectively, $M[\mathbf{X}_i := O_i]_{i \in n} =_{\beta\eta} N$). ■

In the remainder of this paper, a pair of λ-terms $\langle M, N \rangle$ obeying the conditions
of the above definition will also be called a syntactic equation. Moreover, we will
assume that the right-hand side of such an equation (namely, $N$) is a pure closed
λ-term. There is no loss of generality in this assumption because, for $x \in FV(N)$,
$\langle M, N \rangle$ admits a solution if and only if $\langle \lambda x.\, M, \lambda x.\, N \rangle$ admits a solution.



3 Decidability 

We first define the size of a term. 

Definition 5. The size $|M|$ of a linear λ-term $M$ is inductively defined as
follows:

1. $|a| = 1$

2. $|\mathbf{X}| = 0$

3. $|x| = 1$

4. $|\lambda x.\, M| = |M|$

5. $|M\, N| = |M| + |N|$ ■

The set of linear typed λ-terms is a subset of the set of simply typed λ-terms that is
closed under β-reduction. Consequently it inherits the properties of confluence,
subject reduction, and strong normalisation. This last property is also an obvious 
consequence of the following easy lemma. 

Lemma 1. Let $M$ and $N$ be two linear typed λ-terms such that $M \to_\beta N$. Then
$|M| = |N| + 1$.

Proof. A direct consequence of the fact that any λ-abstraction $\lambda x.\, M$ is such
that there is one and only one free occurrence of $x$ in $M$. □
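Lemma 1 can be observed on a small example. The sketch below is only an illustration under stated assumptions: the term classes, `size`, `subst`, `step` and `degree` are names invented here, types are ignored, and `subst` assumes that bound names have been renamed apart from the free names of the substituted term (which α-conversion always allows).

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Unknown:
    name: str


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


Term = Union[Const, Unknown, Var, Lam, App]


def size(t: Term) -> int:
    """Size of Definition 5: constants and variables count 1, unknowns 0."""
    if isinstance(t, (Const, Var)):
        return 1
    if isinstance(t, Unknown):
        return 0
    if isinstance(t, Lam):
        return size(t.body)
    return size(t.fun) + size(t.arg)


def subst(t: Term, x: str, n: Term) -> Term:
    """t[x:=n]; assumes no bound name of t is free in n (no capture)."""
    if isinstance(t, Var):
        return n if t.name == x else t
    if isinstance(t, Lam):
        return t if t.var == x else Lam(t.var, subst(t.body, x, n))
    if isinstance(t, App):
        return App(subst(t.fun, x, n), subst(t.arg, x, n))
    return t


def step(t: Term) -> Optional[Term]:
    """One leftmost-outermost beta-step, or None if t is beta-normal."""
    if isinstance(t, App):
        if isinstance(t.fun, Lam):
            return subst(t.fun.body, t.fun.var, t.arg)
        f = step(t.fun)
        if f is not None:
            return App(f, t.arg)
        a = step(t.arg)
        return None if a is None else App(t.fun, a)
    if isinstance(t, Lam):
        b = step(t.body)
        return None if b is None else Lam(t.var, b)
    return None


def degree(t: Term) -> Tuple[int, Term]:
    """Number of steps to the normal form, together with that normal form."""
    steps = 0
    nxt = step(t)
    while nxt is not None:
        t, steps = nxt, steps + 1
        nxt = step(t)
    return steps, t


# (lambda x. a x) ((lambda y. y) b)
m = App(Lam("x", App(Const("a"), Var("x"))),
        App(Lam("y", Var("y")), Const("b")))
steps, nf = degree(m)
print(size(m), size(nf), steps)  # 4 2 2
```

Each step lowers the size by exactly one, so the number of steps equals the difference between the size of the term and the size of its normal form.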

As a consequence of this lemma, we obtain that all the reduction sequences 
from one term to another one have the same length. 

Lemma 2. Suppose that $M$ and $N$ are two linear typed λ-terms such that
$M \to_\beta^m N$ and $M \to_\beta^n N$. Then $m = n$.

Proof. By iterating Lemma 1, we have $n = |M| - |N| = m$. □

The above lemma allows the following definition to be introduced.

Definition 6. The reducibility degree $\nu(M)$ of a linear typed λ-term $M$ is
defined to be the unique natural number $n$ such that $M \to_\beta^n N$, where $N$ is the
β-normal form of $M$. ■

We also introduce the following notions of complexity. 

Definition 7. The complexity $\rho(\alpha)$ of a linear type $\alpha$ is defined to be the number
of "$\multimap$" it contains:

1. $\rho(a) = 0$

2. $\rho(\alpha \multimap \beta) = \rho(\alpha) + \rho(\beta) + 1$

The complexity $\rho(\Delta)$ of a β-redex $\Delta = (\lambda x.\, M)\, N$ is defined to be the complexity
of the type of the abstraction $\lambda x.\, M$. ■
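As a small worked instance of the type complexity, here is a hedged sketch in which linear types are modelled by two classes, `Atom` and `Arrow`; both names are assumptions of this example and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Arrow:
    """The linear implication src -o tgt."""
    src: "LinType"
    tgt: "LinType"


LinType = Union[Atom, Arrow]


def rho(t: LinType) -> int:
    """Number of linear arrows occurring in the type t."""
    if isinstance(t, Atom):
        return 0
    return rho(t.src) + rho(t.tgt) + 1


i = Atom("i")
# (i -o i) -o (i -o i) contains three arrows.
print(rho(Arrow(Arrow(i, i), Arrow(i, i))))  # 3
```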



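Lemma 3. Let $M \in \mathcal{T}_{\alpha \multimap \beta}$, and $N \in \mathcal{T}_\alpha$. Then

$$\sum_{\Delta \subset M N} \rho(\Delta) \leq \sum_{\Delta \subset M} \rho(\Delta) + \sum_{\Delta \subset N} \rho(\Delta) + \rho(\alpha \multimap \beta).$$

Proof. In case $M$ is not an abstraction, we have $\sum_{\Delta \subset M N} \rho(\Delta) = \sum_{\Delta \subset M} \rho(\Delta) + \sum_{\Delta \subset N} \rho(\Delta)$.
Otherwise we have $\sum_{\Delta \subset M N} \rho(\Delta) = \sum_{\Delta \subset M} \rho(\Delta) + \sum_{\Delta \subset N} \rho(\Delta) + \rho(\alpha \multimap \beta)$,
because $M\, N$ itself is a β-redex whose complexity is $\rho(\alpha \multimap \beta)$. □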



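Lemma 4. Let $M \in \mathcal{T}$, and $N, x \in \mathcal{T}_\alpha$ be such that $x \in FV(M)$. Then

$$\sum_{\Delta \subset M[x:=N]} \rho(\Delta) \leq \sum_{\Delta \subset M} \rho(\Delta) + \sum_{\Delta \subset N} \rho(\Delta) + \rho(\alpha).$$

Proof. By induction on the structure of $M$. The only case that is not
straightforward is when $M = x\, O$ and $N$ is an abstraction. In this case, one additional
redex of complexity $\rho(\alpha)$ is created. □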

We now prove the key lemma of this section. 

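Lemma 5. Let $M$ be any linear typed λ-term. The following inequality holds:

$$\nu(M) \leq \sum_{\Delta \subset M} \rho(\Delta) \qquad (1)$$

Proof. The proof is done by induction on the length of $M$. We distinguish
between two cases according to the structure of $M$.

$M = \xi\, N_0 \ldots N_{n-1}$ where $\xi = a$, $\xi = \mathbf{X}$, or $\xi = x$, and where the sequence
of terms $(N_i)_{i \in n}$ is possibly empty. We have that $\nu(M) = \sum_{i \in n} \nu(N_i)$ and
that $\sum_{\Delta \subset M} \rho(\Delta) = \sum_{i \in n} \sum_{\Delta \subset N_i} \rho(\Delta)$. Consequently, (1) holds by induction
hypothesis.

$M = (\lambda x.\, N)\, N_0 \ldots N_{n-1}$. Let the type of $\lambda x.\, N$ be $\alpha \multimap \beta$. If $n = 0$ (i.e., the
sequence of terms $(N_i)_{i \in n}$ is empty) the induction is straightforward. If $n = 1$,
we have:

$$\begin{aligned}
\nu(M) &= \nu(N[x:=N_0]) + 1 \\
&\leq \sum_{\Delta \subset N[x:=N_0]} \rho(\Delta) + 1 && \text{(by induction hypothesis)} \\
&\leq \sum_{\Delta \subset N} \rho(\Delta) + \sum_{\Delta \subset N_0} \rho(\Delta) + \rho(\alpha) + 1 && \text{(by Lemma 4)} \\
&\leq \sum_{\Delta \subset N} \rho(\Delta) + \sum_{\Delta \subset N_0} \rho(\Delta) + \rho(\alpha \multimap \beta) \\
&= \sum_{\Delta \subset M} \rho(\Delta)
\end{aligned}$$

Finally, if $n \geq 2$:

$$\begin{aligned}
\nu(M) &= \nu(N[x:=N_0]\, N_1 \ldots N_{n-1}) + 1 \\
&\leq \sum_{\Delta \subset N[x:=N_0]\, N_1 \ldots N_{n-1}} \rho(\Delta) + 1 && \text{(by induction hypothesis)} \\
&= \sum_{\Delta \subset N[x:=N_0]\, N_1} \rho(\Delta) + \sum_{i \in n-2} \sum_{\Delta \subset N_{i+2}} \rho(\Delta) + 1 \\
&\leq \sum_{\Delta \subset N[x:=N_0]} \rho(\Delta) + \rho(\beta) + \sum_{i \in n-1} \sum_{\Delta \subset N_{i+1}} \rho(\Delta) + 1 && \text{(by Lemma 3)} \\
&\leq \sum_{\Delta \subset N} \rho(\Delta) + \rho(\alpha) + \rho(\beta) + \sum_{i \in n} \sum_{\Delta \subset N_i} \rho(\Delta) + 1 && \text{(by Lemma 4)} \\
&= \sum_{\Delta \subset N} \rho(\Delta) + \rho(\alpha \multimap \beta) + \sum_{i \in n} \sum_{\Delta \subset N_i} \rho(\Delta) \\
&= \sum_{\Delta \subset M} \rho(\Delta)
\end{aligned}$$
□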

Linearity plays a central role in the above lemmas. In fact, Lemmas 1, 2,
4, and 5 do not hold for the simply typed λ-calculus. This is quite clear for
Lemmas 1 and 2. Moreover, without the latter, Definition 6 does not make sense.
Nevertheless, one might try to adapt this definition to the case of the simply
typed λ-calculus by defining the reducibility degree of a λ-term $M$ to be the
maximal natural number $n$ such that $M \to_\beta^n N$ (where $N$ is the β-normal form
of $M$). Lemma 4 could also be adapted by taking into account the number of free
occurrences of $x$ in $M$. But then, any attempt at adapting Lemma 5 would fail
because linearity plays a part not only in the statement of this last lemma,
but also in its proof. Indeed, this proof is done by induction on the length of a
term $M$ and, in case $M = (\lambda x.\, N)\, N_0$, we apply the induction hypothesis to the
term $N[x:=N_0]$. Now, if $x$ occurs more than once in $N$, there is no reason why
the length of $N[x:=N_0]$ should be less than the length of $(\lambda x.\, N)\, N_0$.

We now prove the main result of this paper. 

Proposition 1. Linear higher-order matching (modulo β) is decidable.

Proof. We prove that the length of any possible solution is bounded, which
implies that the set of possible solutions is finite.

Let $\langle M, N \rangle$ be a linear higher-order matching problem, and assume, without
loss of generality, that $M$ and $N$ are β-normal. Let $(\mathbf{X}_i)_{i \in n}$ be the unknowns
that occur in $M$, and suppose that $(\mathbf{X}_i := O_i)_{i \in n}$ is a solution to the problem
where the $O_i$'s are β-normal. By Lemma 1, we have:

$$\nu(M[\mathbf{X}_i := O_i]_{i \in n}) = |M[\mathbf{X}_i := O_i]_{i \in n}| - |N| \qquad (1)$$

Let $n_i$ be the number of occurrences of $\mathbf{X}_i$ in $M$. Equation (1) may be rewritten
as follows:

$$\nu(M[\mathbf{X}_i := O_i]_{i \in n}) = \sum_{i \in n} n_i |O_i| + |M| - |N| \qquad (2)$$

On the other hand, by Lemma 5, we have:

$$\nu(M[\mathbf{X}_i := O_i]_{i \in n}) \leq \sum_{\Delta \subset M[\mathbf{X}_i := O_i]_{i \in n}} \rho(\Delta) \qquad (3)$$

Since $M$ and $(O_i)_{i \in n}$ are β-normal, the β-redexes that occur in $M[\mathbf{X}_i := O_i]_{i \in n}$
are created by the substitution, which is the case whenever some $O_i$ is an
abstraction and some subterm of the form $\mathbf{X}_i\, P$ occurs in $M$. Consequently, we
have:

$$\sum_{\Delta \subset M[\mathbf{X}_i := O_i]_{i \in n}} \rho(\Delta) \leq \sum_{i \in n} n_i\, \rho(\alpha_i) \qquad (4)$$

where $\alpha_i$ is the type of the unknown $\mathbf{X}_i$. From (3) and (4), we have

$$\nu(M[\mathbf{X}_i := O_i]_{i \in n}) \leq \sum_{i \in n} n_i\, \rho(\alpha_i) \qquad (5)$$

Finally, from (2) and (5), we obtain

$$\sum_{i \in n} n_i |O_i| \leq |N| - |M| + \sum_{i \in n} n_i\, \rho(\alpha_i) \qquad (6)$$

which gives an upper bound to the length of the solution. □
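The right-hand side of inequality (6) is plain arithmetic on data read off the problem. The sketch below is illustrative only; the function name and its argument layout are assumptions made here.

```python
from typing import List, Tuple


def solution_size_bound(size_m: int, size_n: int,
                        unknowns: List[Tuple[int, int]]) -> int:
    """Right-hand side of inequality (6).

    size_m, size_n: the sizes |M| and |N| of the two sides of the problem.
    unknowns: one pair (n_i, rho_i) per unknown, where n_i is its number of
    occurrences in M and rho_i is the complexity of its type.
    """
    return size_n - size_m + sum(n_i * rho_i for n_i, rho_i in unknowns)


# Problem <lambda x. X (a (X x)), lambda x. b (a (b x))> with X of type i -o i:
# |M| = 2, |N| = 4, X occurs twice and its type has complexity 1.
bound = solution_size_bound(2, 4, [(2, 1)])
print(bound)  # 4
```

In this example the bound says that twice the size of the term substituted for X is at most 4, so that term has size at most 2. The solution X := λz. b z has size exactly 2.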

As a corollary to this proposition, we immediately obtain that higher-order
linear matching modulo βη is decidable because the set of η-expanded β-normal
forms closed by abstraction and application is provably closed under substitution
and β-reduction [10].

4 NP-Completeness 

Note that the upper bound given by Proposition 1 is polynomial in the length 
of the problem. Moreover, β- and βη-conversion between two pure linear
λ-terms may be decided in polynomial time since normalisation is linear. Hence,
a non deterministic Turing machine may guess a substitution and check that 
this substitution is indeed a solution in polynomial time. Consequently, linear 
higher-order matching belongs to NP. In fact, as we show in this section, it is 
NP-complete. 

Let $\Sigma$ be an alphabet containing at least two symbols, and let $\mathcal{X}$ be a
countable set of variables. We write $\Sigma^*$ (respectively, $\Sigma^+$) for the set of words
(respectively, non-empty words) generated by $\Sigma$. We denote the concatenation of two
words $u$ and $v$ by $u \cdot v$. The next proposition, which states the NP-completeness
of associative matching, is due to Angluin [1, Theorem 3.6].

Proposition 2. Given $v \in (\Sigma \cup \mathcal{X})^*$ and $w \in \Sigma^*$, deciding whether there exists
a non-erasing substitution $\sigma : \mathcal{X} \to \Sigma^+$ such that $\sigma(v) = w$ is NP-complete. □

We need a slight variant of this proposition in order to take erasing
substitutions into account.

Proposition 3. Given $v \in (\Sigma \cup \mathcal{X})^*$ and $w \in \Sigma^*$, deciding whether there exists
a substitution $\sigma : \mathcal{X} \to \Sigma^*$ such that $\sigma(v) = w$ is NP-complete.

Proof. Consider a symbol, say #, that does not belong to $\Sigma$. For any word
$u \in (\Sigma \cup \mathcal{X})^*$, let $\overline{u}$ be inductively defined as follows:

1. $\overline{\epsilon} = \epsilon$, where $\epsilon$ is the empty word;

2. $\overline{a \cdot u'} = \# \cdot a \cdot \overline{u'}$, where $a \in (\Sigma \cup \mathcal{X})$, and $u' \in (\Sigma \cup \mathcal{X})^*$.

Let $v' \in (\Sigma \cup \mathcal{X})^*$ and $w' \in \Sigma^*$. It is almost immediate that there exists a
substitution $\sigma : \mathcal{X} \to \Sigma^+$ such that $\sigma(v') = w'$ if and only if there exists a
substitution $\tau : \mathcal{X} \to (\Sigma \cup \{\#\})^*$ such that $\tau(\overline{v'}) = \overline{w'}$. □
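The padding used in this proof can be tried out on small words. The sketch below is an illustration under the reading of the proof given above, in which both the pattern and the target word are padded with the marker; the helper names `pad`, `apply` and `lift` are invented for this example.

```python
from typing import Dict, List

Word = List[str]
MARK = "#"  # assumed not to belong to the alphabet


def pad(word: Word) -> Word:
    """Insert the marker in front of every symbol of the word."""
    out: Word = []
    for sym in word:
        out += [MARK, sym]
    return out


def apply(subst: Dict[str, Word], word: Word) -> Word:
    """Replace each variable by its image; other symbols stay unchanged."""
    out: Word = []
    for sym in word:
        out += subst.get(sym, [sym])
    return out


def lift(sigma: Dict[str, Word]) -> Dict[str, Word]:
    """Turn a non-erasing substitution into one for the padded problem.

    The image of each variable is padded and its leading marker dropped,
    because the padded pattern already supplies that marker.
    """
    return {x: pad(img)[1:] for x, img in sigma.items()}


v = ["X", "a"]                 # pattern, X is the only variable
sigma = {"X": ["a", "b"]}      # non-erasing solution of the original problem
print(apply(lift(sigma), pad(v)) == pad(apply(sigma, v)))
```

Erasing a variable in the padded pattern leaves two adjacent markers, as in `apply({"X": []}, pad(v))`, which no padded word contains. This is why solutions of the padded problem correspond to non-erasing solutions of the original one.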

In order to get our NP-completeness result, it remains to reduce the problem 
of the above proposition to a linear higher-order matching problem. The trick is 
to encode word concatenation as function composition. 

Proposition 4. Linear higher-order matching is NP-complete. 

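Proof. Let $\iota \in \mathcal{A}$ be an atomic type, and let $\Sigma_{\iota \multimap \iota} = \Sigma$ and $\mathcal{Y}_{\iota \multimap \iota} = \mathcal{X}$. Finally,
let $x \in \mathcal{X}_\iota$. For any word $u \in (\Sigma \cup \mathcal{X})^*$, we inductively define $\overline{u}$ as follows:

1. $\overline{\epsilon} = \lambda x.\, x$, where $\epsilon$ is the empty word;

2. $\overline{a \cdot u'} = \lambda x.\, a\, (\overline{u'}\, x)$, where $a \in (\Sigma \cup \mathcal{X})$, and $u' \in (\Sigma \cup \mathcal{X})^*$.

It is easy to show that there exists a substitution $\sigma : \mathcal{X} \to \Sigma^*$ such that $\sigma(v) = w$
if and only if the syntactic equation $\langle \overline{v}, \overline{w} \rangle$ admits a solution (modulo β, or
modulo βη). □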
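The idea of encoding concatenation as composition can be illustrated with Python closures standing in for λ-terms. This is only an analogy under stated assumptions: a constant of type ι ⊸ ι is modelled as a function that prepends its letter to a string, and `const`, `encode` and `solves` are hypothetical names.

```python
from typing import Callable, Dict, List

Fn = Callable[[str], str]


def const(letter: str) -> Fn:
    """Model of a constant of type i -o i: prepend the letter."""
    return lambda s: letter + s


def encode(word: List[str], env: Dict[str, Fn]) -> Fn:
    """Follow the inductive definition of the encoding.

    The empty word becomes the identity; a word a.u' becomes the function
    mapping x to a(u'(x)), where env gives the meaning of each symbol.
    """
    if not word:
        return lambda x: x
    head, rest = env[word[0]], encode(word[1:], env)
    return lambda x: head(rest(x))


def solves(v: List[str], w: str, sigma: Dict[str, str], alphabet: str) -> bool:
    """Check a candidate word substitution through the functional encoding."""
    env: Dict[str, Fn] = {a: const(a) for a in alphabet}
    # Each variable is interpreted by the encoding of the word it stands for.
    for var, image in sigma.items():
        env[var] = encode(list(image), {a: const(a) for a in alphabet})
    return encode(v, env)("") == w


print(solves(["X", "a", "X"], "abaab", {"X": "ab"}, "ab"))
```

Applying the encoded pattern to the empty string plays the role of normalising the instantiated λ-term. Mapping a variable to the empty word interprets it as the identity function, which is how erasing substitutions are covered.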

The existence of a set of constants seems to be crucial in the above
proof. Indeed, contrary to the case of the simply typed λ-calculus, there is an
essential difference between constants and free λ-variables. Clause 4 of Definition
3 implies that there is at most one free occurrence of any λ-variable in any
linear λ-term. There is no such restriction on the constants. Consequently, a
given constant may occur several times in the same linear λ-term. This fact is
implicitly used in the above proof.



5 Heuristics for an Implementation 

In this section, we give a practical algorithm obtained by specialising Huet’s 
unification procedure [9]. We first specify this algorithm by giving a set of
transformations in the spirit of [11, 20]. These transformations obey the following
form:

$$e \longrightarrow \langle S_e, \sigma_e \rangle \qquad (*)$$

where $e$ is a syntactic equation, $S_e$ is a set of syntactic equations, and $\sigma_e$ is a
substitution. Transformations such as (*) may then be applied to pairs $\langle S, \sigma \rangle$
made of a set $S$ of syntactic equations, and a substitution $\sigma$:

$$\langle S, \sigma \rangle \longrightarrow \langle \sigma_e((S \setminus \{e\}) \cup S_e),\ \sigma_e \circ \sigma \rangle \qquad (**)$$

provided that $e \in S$. By iterating (**), one possibly exhausts the set $S$:

$$\langle S, \sigma \rangle \longrightarrow^* \langle \emptyset, \tau \rangle,$$

in which case $\tau$ is intended to be a solution of the system of syntactic equations
$S$.

For the sake of simplicity, we specify an algorithm that solves the problem
modulo βη. The set of transformations is given by the three schemes listed below.
All the λ-terms occurring in these schemes are considered to be in η-expanded
β-normal forms.

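1. Simplification:

$$e = \langle \lambda x_P.\, a(M_Q),\ \lambda x_P.\, a(N_Q) \rangle$$
$$S_e = \{\, \langle \lambda x_{P_i}.\, M_i,\ \lambda x_{P_i}.\, N_i \rangle \mid i \in Q \,\}$$
$$\sigma_e = \mathrm{id}$$

provided that $FV(M_i) = FV(N_i)$, and where $a$ is either a constant or a bound
variable, and the family of sets $(P_i)_{i \in Q}$ is such that $\{x_j \mid j \in P_i\} = FV(M_i)$.

2. Imitation:

$$e = \langle \lambda x_P.\, \mathbf{X}(M_Q),\ \lambda x_P.\, a(N_R) \rangle$$
$$S_e = \{e\}$$
$$\sigma_e = (\mathbf{X} := \lambda y_Q.\, a(\lambda z_{n_i}.\, \mathbf{Y}_i(y_{Q_i})(z_{n_i}))_{i \in R})$$

where $(Q_i)_{i \in R}$ is a family of disjoint sets such that $\bigcup_{i \in R} Q_i = Q$, and $(\mathbf{Y}_i)_{i \in R}$ is
a family of fresh unknowns whose types may be inferred from the context.

3. Projection:

$$e = \langle \lambda x_P.\, \mathbf{X}(M_Q),\ \lambda x_P.\, a(N_R) \rangle$$
$$S_e = \{e\}$$
$$\sigma_e = (\mathbf{X} := \lambda y_Q.\, y_k(\lambda z_{n_i}.\, \mathbf{Y}_i(y_{Q_i})(z_{n_i}))_{i \in m})$$

where $k \in Q$, $m$ is the arity of $y_k$, $(Q_i)_{i \in m}$ is a family of disjoint sets such that
$\bigcup_{i \in m} Q_i = Q \setminus \{k\}$, and $(\mathbf{Y}_i)_{i \in m}$ is a family of fresh unknowns of the appropriate
type.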



We now sketch the proof that the above set of transformations is correct and 
complete.

Lemma 6. Let $e \longrightarrow \langle S_e, \sigma_e \rangle$ be one of the above transformations. Let $S$ be a
set of syntactic equations such that $e \in S$. If $\sigma$ is a substitution that solves all
the syntactic equations in $\sigma_e((S \setminus \{e\}) \cup S_e)$ then $\sigma \circ \sigma_e$ solves all the syntactic
equations in $S$.

Proof. It suffices to show that $\sigma \circ \sigma_e$ solves $e$ whenever $\sigma$ solves $\sigma_e(S_e)$, which is
trivial for the three transformations. □



Lemma 7. Let $S$ and $T$ be two sets of syntactic equations, and let $\tau$ be a
substitution such that $\langle S, \mathrm{id} \rangle \longrightarrow^* \langle T, \tau \rangle$. If $\sigma$ is a substitution that solves all the
syntactic equations in $T$ then $\sigma \circ \tau$ solves all the syntactic equations in $S$.

Proof. By iterating the previous lemma. □

As a direct consequence of this lemma, we obtain the correctness of the 
transformational algorithm. 

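Proposition 5. Let $e = \langle M, N \rangle$ be a syntactic equation and $\sigma$ be a substitution
such that $\langle \{e\}, \mathrm{id} \rangle \longrightarrow^* \langle \emptyset, \sigma \rangle$. Then $\sigma(M) = N$. □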

We now prove the completeness of our algorithm. 

Proposition 6. Let $e = \langle M, N \rangle$ be a syntactic equation and $\sigma$ be a
substitution such that $\sigma(M) = N$. Then, there exists a sequence of transitions such that
$\langle \{e\}, \mathrm{id} \rangle \longrightarrow^* \langle \emptyset, \sigma' \rangle$, and $\sigma'$ agrees with $\sigma$ on the set of unknowns occurring in
$M$.

Proof. Let $S$ be a non-empty set of syntactic equations and let $\sigma$ be a
substitution that solves all the equations in $S$. One easily shows (see [20, Lemmas 4.16
and 4.17] for details) that there exists $e \in S$ together with a transformation

$$e \longrightarrow \langle S_e, \sigma_e \rangle, \qquad (1)$$

and a substitution $\tau$ such that:

1. $\sigma = \tau \circ \sigma_e$,

2. $\tau$ solves $\sigma_e((S \setminus \{e\}) \cup S_e)$.

Consequently, by iterating (1) on some system $R_0$ that is solved by some
substitution $\sigma_0$, one obtains a sequence of transitions:

$$\langle R_0, \mathrm{id} \rangle \longrightarrow \langle R_1, \rho_1 \rangle \longrightarrow \langle R_2, \rho_2 \rangle \longrightarrow \cdots \qquad (2)$$

together with a sequence of substitutions $(\sigma_0, \sigma_1, \sigma_2, \ldots)$ such that:

1. $\sigma_0 = \sigma_i \circ \rho_i$,

2. $\sigma_i$ solves $R_i$.

It remains to prove that (2) eventually terminates and, therefore, exhausts
the set $R_0$. To this end, define the size of a system $S$ (in notation, $|S|$) to be the
sum of the sizes of the right-hand sides of the syntactic equations occurring in $S$.
Define also the size of a substitution $\sigma$ with respect to a system $S$ (in notation,
$|\sigma|_S$) to be the sum of the sizes of the terms substituted for the unknowns
occurring in $S$. Transformation (1) and substitution $\tau$ are such that

$$|\tau|_T \leq |\sigma|_S,$$

where $T = \sigma_e((S \setminus \{e\}) \cup S_e)$. It is then easy to show that each transition of (2)
strictly decreases the pair $\langle |R_i|, |\sigma_i|_{R_i} \rangle$ according to the lexicographic ordering. □

The three transformations we have given specify a non-deterministic algo- 
rithm. Its practical implementation would therefore appeal to some backtrack- 
ing mechanism. This is not surprising since we have proved linear higher-order 
matching to be NP-complete. Nevertheless, some source of non-determinism 
could be avoided. We conclude by discussing this issue. 

A naive implementation of the transformational algorithm would give rise to 
backtracking steps for two reasons: 

1. the current non-empty set of syntactic equations is such that no transforma- 
tion applies; 

2. the size of the current substitution is strictly greater than the upper bound 
given by proposition 1. 

It is easy to see that the first case of failure can be detected earlier. Indeed, if no 
transformation applies to a system S, it means that all the syntactic equations 
in S have the following form: 

$$\langle \lambda x_m.\, a(M_n),\ \lambda x_m.\, b(N_o) \rangle \qquad (3)$$

where either $a \neq b$, or $a = b$ but there exists $k \in n$ such that $FV(M_k) \neq FV(N_k)$.
Now, it is clear that such equations cannot be solved. Consequently,
one may fail as soon as a system S contains at least one equation like (3). In 
addition, one may easily prove that any application of the simplification rule does 
not alter the set of possible solutions. Therefore simplification may be applied 
deterministically. These observations give rise to the following heuristic: 

Start by repeatedly applying simplification until all the heads of the
left-hand sides of the equations are unknowns. If this is not possible, fail.

This heuristic is not proper to our linear matching problem. In fact, it belongs 
to the folklore of higher-order unification. We end this section by giving some 
further heuristic principles that are specific to the linear aspects of our problem. 

The next three lemmas, whose elementary proofs are left to the reader, will 
allow us to state another condition of possible failure that we may check before 
applying any transformation. Let $\#_a(M)$ denote the number of occurrences of a
given constant "a" in some λ-term $M$. Similarly, let $\#_{\mathbf{X}}(M)$ denote the number
of occurrences in $M$ of some unknown $\mathbf{X}$.

Lemma 8. Let $M$ and $N$ be two linear λ-terms, and let $x \in FV(M)$. Then, for
every constant $a$, $\#_a(M[x:=N]) = \#_a(M) + \#_a(N)$. □

Lemma 9. Let $M$ and $N$ be two linear λ-terms such that $M \twoheadrightarrow_\beta N$. Then, for
every constant $a$, $\#_a(M) = \#_a(N)$. □

Lemma 10. Let $M$ and $N$ be two linear λ-terms, and $\mathbf{X}$ be some unknown.
Then, for every constant $a$, $\#_a(M[\mathbf{X}:=N]) = \#_a(M) + \#_{\mathbf{X}}(M) \times \#_a(N)$. □

As a consequence of these lemmas, we have that the number of occurrences of
any constant in the left-hand side of any equation cannot decrease. This allows
us to state the following failure condition.

Check, for every constant $a$, that each equation $\langle M, N \rangle$ is such that
$\#_a(M) \leq \#_a(N)$. If this is not the case, fail.
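Both counting tests, the failure condition and the comparison used when choosing between imitation and projection, reduce to counting symbol occurrences. The following is a minimal sketch with terms encoded as nested tuples; the encoding and the names `count`, `may_be_solvable` and `projection_forced` are assumptions of this example.

```python
# Terms as nested tuples (an encoding assumed for this sketch):
#   ("const", a), ("unk", X), ("var", x), ("lam", x, body), ("app", fun, arg)

def count(t, tag, name):
    """Number of occurrences in t of the leaf (tag, name)."""
    if t[0] in ("const", "unk", "var"):
        return 1 if (t[0], t[1]) == (tag, name) else 0
    if t[0] == "lam":
        return count(t[2], tag, name)
    return count(t[1], tag, name) + count(t[2], tag, name)


def may_be_solvable(equations, constants):
    """Failure condition: no constant may occur more often on the left."""
    return all(count(m, "const", a) <= count(n, "const", a)
               for m, n in equations for a in constants)


def projection_forced(equations, a, x):
    """True if imitating the constant a for the unknown x would create
    more occurrences of a than some right-hand side contains."""
    return any(count(lhs, "const", a) + count(lhs, "unk", x)
               > count(rhs, "const", a)
               for lhs, rhs in equations)


def lam_x(body):
    return ("lam", "x", body)


def app(f, arg):
    return ("app", f, arg)


X, a, b, x = ("unk", "X"), ("const", "a"), ("const", "b"), ("var", "x")
M = lam_x(app(X, app(a, app(X, x))))    # lambda x. X (a (X x))
N1 = lam_x(app(b, app(a, app(b, x))))   # lambda x. b (a (b x))
N2 = lam_x(app(a, x))                   # lambda x. a x

print(may_be_solvable([(M, N1)], ["a", "b"]))
print(projection_forced([(M, N2)], "a", "X"))
```

For the pair of M and N2, imitating a would yield three occurrences of a against a single one on the right, so only projection remains. Taking X := λy. y does solve that equation.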

The above condition may be checked before applying any transformation, 
and then kept as an invariant. To this end, it must be incorporated as a proviso 
to the simplification rule. Then, the choice between imitation and/or projection 
must obey the following principle: 

When considering an equation such as

$$\langle \lambda x_P.\, \mathbf{X}(M_Q),\ \lambda x_P.\, a(N_R) \rangle$$

check whether there exists some equation $\langle A, B \rangle$ (including the above
equation) such that $\#_a(A) + \#_{\mathbf{X}}(A) > \#_a(B)$. If this is the case, projection
is forced. Otherwise, try imitation before trying projection.

The reason for trying imitation first, which is a heuristic used by Paulson in 
Isabelle [18], is that each application of imitation gives rise to a subsequent 
simplification. 

When applying imitation, we face the problem of guessing the family of sets
$(Q_i)_{i \in R}$. This source of non-determinism is typical of linear higher-order
unification.¹ Now, since any application of imitation may be immediately followed by
a simplification, the family $(Q_i)_{i \in R}$ should be such that the subsequent
simplification may be applied. We now explain how this constraint may be satisfied.
Consider some linear λ-term $A$ whose η-expanded β-normal form is:

$$\lambda x_P.\, a(M_Q)$$

where $a$ is not a bound variable. We define the incidence function of $A$ to be the
unique $f_A : P \to Q$ such that:

$$f_A(i) = j \text{ if and only if } x_i \in FV(M_j).$$

¹ It is due to the multiplicative nature of the connective $\multimap$ and is reminiscent of
the context-splitting problem one has to solve when trying to prove a multiplicative
sequent of the form $\Gamma \vdash A \otimes B$ by a backward application of the $\otimes$-introduction
rule.

Now, consider the two following λ-terms:



$$A = \lambda x_P.\, \mathbf{X}(M_Q)$$
$$B = \lambda x_Q.\, a(N_R)$$

It is not difficult to show that the incidence function of $A[\mathbf{X}:=B]$ is such that:

$$f_{A[\mathbf{X}:=B]} = f_B \circ f_A \qquad (4)$$

We will apply the above identity to the case of the imitation rule. Let $A$, $B$,
and $C$ be the terms involved in the definition of an imitation step:

$$A = \lambda x_P.\, \mathbf{X}(M_Q)$$
$$B = \lambda x_P.\, a(N_R)$$
$$C = \lambda y_Q.\, a(\lambda z_{n_i}.\, \mathbf{Y}_i(y_{Q_i})(z_{n_i}))_{i \in R}$$

After the imitation step is performed, equation $\langle A, B \rangle$ is replaced by equation:

$$\langle A[\mathbf{X}:=C],\ B \rangle \qquad (5)$$

Simplification may then be applied to (5) provided that

$$f_{A[\mathbf{X}:=C]} = f_B \qquad (6)$$

which, by (4), is equivalent to

$$f_C \circ f_A = f_B \qquad (7)$$

Note that both $f_A$ and $f_B$ are known, while $f_C$ is uniquely determined by the
family of sets $(Q_i)_{i \in R}$, and conversely, since

$$f_C(i) = j \text{ if and only if } y_i \in Q_j$$

Therefore, in order to find an appropriate family of sets $(Q_i)_{i \in R}$, it suffices to
solve (7). Now, it is an elementary theorem of set theory that (7) admits a
solution if and only if

$$(\forall i, j \in P)\ \ f_A(i) = f_A(j) \Rightarrow f_B(i) = f_B(j)$$

which gives a condition that may be checked before applying any imitation.
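Solving equation (7) amounts to factoring one finite function through another. The sketch below represents incidence functions as Python dictionaries, which is an assumption of this example, and the name `factor` is invented here.

```python
from typing import Dict, Optional


def factor(f_a: Dict[int, int], f_b: Dict[int, int]) -> Optional[Dict[int, int]]:
    """Find f_c with f_c(f_a(i)) == f_b(i) for every i, if one exists.

    The result is only defined on the image of f_a; the remaining indices
    are left unconstrained by equation (7). Returns None exactly when two
    indices agree under f_a but disagree under f_b.
    """
    f_c: Dict[int, int] = {}
    for i, q in f_a.items():
        r = f_b[i]
        # setdefault records the first value seen for q and returns it.
        if f_c.setdefault(q, r) != r:
            return None
    return f_c


f_a = {0: 0, 1: 0, 2: 1}
print(factor(f_a, {0: 1, 1: 1, 2: 0}))  # {0: 1, 1: 0}
print(factor(f_a, {0: 0, 1: 1, 2: 0}))  # None
```

In the second call, indices 0 and 1 are sent to the same place by the first function but to different places by the second, so no factorisation exists and imitation need not be tried.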
Abstract. As a minimal environment for the study of permutative
reductions an extension ΛJ of the untyped λ-calculus is considered. In
this non-terminating system with non-trivial critical pairs, confluence is
established by studying triangle properties that allow permutative
reductions to be treated modularly and could be extended to more complex term
systems with permutations. Standardization is shown by means of an
inductive definition of standard reduction that closely follows the inductive
term structure and captures the intuitive notion of standardness even for
permutative reductions.



1 Introduction 

Standardization and confluence are phenomena that appear naturally in
untyped term rewrite systems such as pure λ-calculus. Permutative reductions
usually are studied only in typed systems, e.g., with sum types (see for instance
[4]). The calculus ΛJ extends λ-calculus by a generalized application $rS$ that
gives rise to permutative reductions of the form $(rS)T \mapsto r(S\{T\})$ already in
the untyped, non-normalizing setting. The resulting term rewrite system is also
not orthogonal, rendering confluence and standardization demanding problems.
Nevertheless, there is a perspicuous, modular and extensible solution for them,
so in fact ΛJ serves as a minimal model for the study of permutation.

The calculus ΛJ has been presented (and its simply-typed version proven strongly
normalizing) in [5] as the untyped core of a notation system for von Plato's
generalized natural deduction trees [14]. It enjoys a particularly simple
characterization of normal forms:

$$r, s, t ::= x \mid xR \mid \lambda x r, \qquad R ::= (s, z.t).$$

With simple types assigned, this grammar represents the cut-free derivations in
Gentzen's sequent calculus LJ:

$$\frac{}{\Gamma, x:A \vdash x:A}
\qquad
\frac{\Gamma \vdash s:A \quad \Gamma, z:B \vdash t:C}{\Gamma, x:A \to B \vdash x(s, z.t):C}\,(L{\to})
\qquad
\frac{\Gamma, x:A \vdash r:B}{\Gamma \vdash \lambda x r : A \to B}$$

L. Bachmair (Ed.): RTA 2000, LNCS 1833, pp. 141-155, 2000. 

© Springer-Verlag Berlin Heidelberg 2000

ΛJ generalizes rule $(L{\to})$ by allowing arbitrary terms $r$ instead of the
variable $x$ in $x(s, z.t)$. Therefore, ΛJ can be understood as the closure of
the computational (i.e., type-free) content of cut-free LJ under substitution,
hence the name ΛJ. It also incorporates λ-calculus by setting $rs := r(s, z.z)$.

Standardization. The standardization theorem (see, e.g., [1]) establishes that
reduction sequences from one term to another can constructively be transformed
into a standard reduction sequence. For the pure λ-calculus, henceforth called
Λ, the latter concept can be put intuitively as follows: a redex
$(\lambda x r)s t_1 \ldots$ is never converted after reductions in $r$, $s$ or
the $t_i$ have been performed. Formally, this is usually expressed by requiring
that redexes are executed in a strictly left-to-right order as recognized by
the position of the λ of a redex $(\lambda x r)s$ in the string representation
of terms. This also imposes the restriction that a standard reduction sequence
starting with $xrs$ has to perform all reductions on $r$ before those on $s$,
although this is not contained in the intuitive notion given above, which
rather implies that the reductions in $r$ and $s$ are independent of each
other. There is even more freedom in the presence of permutative reductions: a
permutative redex of the form $(\lambda x r)RST$ may permute to
$(\lambda x r)(R\{S\})T$ and to $(\lambda x r)R(S\{T\})$, and both
possibilities as well as the embedded β-redex should be treated equivalently.

We therefore use an inductive definition of standard reduction $\to_s$ that
closely follows the inductive term structure in a canonical way and captures
the intuitive notion of standardness even for permutative reductions.¹ The
standardization theorem states ${\to^*} \subseteq {\to_s}$ and thereby yields a
new induction principle for $\to^*$. As a prototypical application we prove a
syntax-directed inductive characterization of the weakly normalizing terms of
ΛJ.²

Confluence. The combination of β-reduction and permutative reductions leads
to a system with non-trivial critical pairs,³ i.e., a non-orthogonal term
rewrite relation. Although the critical pairs can be joined and hence the
reduction relation is locally confluent by the Critical Pair Lemma [8], this
does not suffice for full confluence, as the calculus ΛJ is not (strongly)
normalizing and so Newman's lemma is not applicable. Since the critical pairs
are not development closed (not even almost, cf. [12]), we also cannot use the
extensions of Huet's and Toyama's results in [11].

However, permutations always converge and the interaction between β-reduction
and permutation is not too intricate. This allows us to prove a commutation
property between β-developments and π-reduction. Using sequential composition
(instead of the union) of those two relations, we can establish a development
for the combined reduction relation, and confluence ensues. This proof enjoys a
particular modularity in that it derives the triangle property of a combination

of reduction relations from the respective property of each part, analogous to
the Hindley-Rosen Lemma for confluence.

¹ After our presentation of such a notion for Λ in spring 1998, Tobias Nipkow
kindly pointed us to Loader's [6] where this notion had been developed
independently.

² For Λ such a characterization has been worked out in [13].

³ Λ with η-reduction has two trivial critical pairs arising from
$(\lambda x.rx)s \to_{\eta,\beta} rs$ and
$\lambda x.(\lambda x.rx)x \to_{\eta,\beta} \lambda x.rx$ for $x$ not free in
$r$.

As will be shown elsewhere, the method carries over to other systems with
permutative reductions as long as they enjoy an introduction/elimination
dichotomy under the natural deduction (Curry-Howard) interpretation, e.g.,
calculi with sum types and even infinitary terms with the ω-rule [9, 7].

Acknowledgements to Henk Barendregt, Herman Jervell, Tobias Nipkow, Vincent
van Oostrom, Martin Ruckert and Helmut Schwichtenberg for listening to talks or
reading previous versions and giving stimulating advice.

2 A λ-Calculus with Generalized Applications

We briefly recall the definition of the pure λ-calculus Λ [1]: Terms are
generated from variables $x, y, z$ (of an infinite supply) by the grammar

$$\Lambda \ni r, s ::= x \mid rs \mid \lambda x r.$$

$x$ is bound in $\lambda x r$. Terms are identified on the syntax level if they
differ only in the names of bound variables. The term rewrite relation
$\to_\beta$ is the compatible closure of the β-reduction rule
$(\lambda x r)s \mapsto_\beta r[x := s]$, where $r[x := s]$ denotes
capture-free substitution of $s$ for $x$ in $r$.

The term grammar of Λ can be inductively characterized by

$$\Lambda \ni r, s, t ::= x\vec{s} \mid \lambda x r \mid (\lambda x r)s\vec{s},$$

using the vector notation $\vec{s}$ for a possibly empty list of terms
$s_1, \ldots, s_n$ of the grammar. β-normal terms have the grammar

$$r ::= x\vec{r} \mid \lambda x r.$$

The calculus ΛJ is defined by generalizing the application rule of Λ:

$$\Lambda J \ni r, s, t ::= x \mid r(s, z.t) \mid \lambda x r.$$

The variable $x$ gets bound in $\lambda x r$ as before, and $z$ gets bound in
$z.t$. Again, we consider terms only up to α-equivalence (variable convention).
It is clear how to define substitution of terms for variables avoiding the
capture of bound variables by choosing their names appropriately. By
definition, generalized applications associate to the left like traditional
applications in Λ (which do so only by convention).

Embedding. ΛJ can be definitionally embedded into Λ by setting

$$r(s, z.t) := t[z := rs].$$

This motivates the contraction rules of ΛJ, defined below:

- $(\lambda x r)(s, z.t)$ translates to $t[z := (\lambda x r)s]$, which
  β-reduces to $t[z := r[x := s]]$.

- $r(s, z.t)(s', z'.t')$ and $r(s, z.t(s', z'.t'))$ translate to the same
  λ-term

  $$t'[z' := t[z := rs]s'] = (t'[z' := ts'])[z := rs].$$

  Therefore, we want to rewrite $r(s, z.t)(s', z'.t')$ to
  $r(s, z.t(s', z'.t'))$ in ΛJ. This is clearly a permutative reduction: the
  outer generalized application is pulled into the inner one.
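The translation of ΛJ into Λ described above can be made concrete in a few
lines. The following Python sketch is only an illustration under stated
assumptions: the tuple encoding of terms and the names `free_vars`, `fresh`,
`subst` and `to_lambda` are choices made here, not part of the paper. It
implements $r(s, z.t) := t[z := rs]$ with capture-avoiding substitution and can
be used to check that both sides of a permutation translate to the same λ-term.

```python
from itertools import count

# Terms are nested tuples (an encoding chosen for this sketch):
#   ('var', x)            variable
#   ('lam', x, r)         abstraction  λx r
#   ('app', r, s)         ordinary application  r s          (Λ only)
#   ('gapp', r, s, z, t)  generalized application r(s, z.t)  (ΛJ only)

def free_vars(t):
    """Free variables of a plain λ-term."""
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'lam':
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])       # 'app'

def fresh(base, avoid):
    """First name base0, base1, ... that is not in avoid."""
    return next(f"{base}{i}" for i in count() if f"{base}{i}" not in avoid)

def subst(t, x, s):
    """Capture-avoiding substitution t[x := s] on plain λ-terms."""
    if t[0] == 'var':
        return s if t[1] == x else t
    if t[0] == 'app':
        return ('app', subst(t[1], x, s), subst(t[2], x, s))
    y, body = t[1], t[2]
    if y == x:
        return t
    if y in free_vars(s):                           # rename the binder first
        y2 = fresh(y, free_vars(s) | free_vars(body) | {x})
        body, y = subst(body, y, ('var', y2)), y2
    return ('lam', y, subst(body, x, s))

def to_lambda(t):
    """Translate a ΛJ term to Λ via r(s, z.t) := t[z := r s]."""
    if t[0] == 'var':
        return t
    if t[0] == 'lam':
        return ('lam', t[1], to_lambda(t[2]))
    _, r, s, z, u = t
    return subst(to_lambda(u), z, ('app', to_lambda(r), to_lambda(s)))
```

For instance, translating $x(y, z.z)(w, v.v)$ and $x(y, z.z(w, v.v))$ with
`to_lambda` gives the same term $(xy)w$ in both cases.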

Hence, the contractions of ΛJ are

$$(\lambda x r)(s, z.t) \mapsto_\beta t[z := r[x := s]],$$
$$r(s, z.t)(s', z'.t') \mapsto_\pi r(s, z.t(s', z'.t')).$$

The respective one-step reduction relations $\to_\beta$ and $\to_\pi$ are
generated from these contractions by means of the compatible closure, defined
for $\rho \in \{\beta, \pi\}$ by

- If $r \mapsto_\rho r'$ then $r \to_\rho r'$.

- If $r \to_\rho r'$ then $r(s, z.t) \to_\rho r'(s, z.t)$,
  $s(r, z.t) \to_\rho s(r', z.t)$, $s(t, z.r) \to_\rho s(t, z.r')$
  and $\lambda x r \to_\rho \lambda x r'$.

Clearly, the embedding of Λ into ΛJ by $rs := r(s, z.z)$ preserves
β-reduction.⁴ Permutative reductions lead out of the range of this embedding.
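To make the two contraction rules tangible, here is a small Python sketch of
ΛJ terms with root-level β- and π-contraction. It is illustrative only: the
tuple encoding and all function names (`free_vars`, `fresh`,
`rename_if_needed`, `subst`, `beta`, `pi`, `plain_app`) are assumptions of this
sketch. Bound variables are renamed where necessary, which plays the role of
the variable convention.

```python
from itertools import count

# ΛJ terms as nested tuples (an encoding chosen for this sketch):
#   ('var', x) | ('lam', x, r) | ('app', r, s, z, t)   for r(s, z.t)

def free_vars(t):
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'lam':
        return free_vars(t[2]) - {t[1]}
    _, r, s, z, u = t
    return free_vars(r) | free_vars(s) | (free_vars(u) - {z})

def fresh(base, avoid):
    """First name base0, base1, ... that is not in avoid."""
    return next(f"{base}{i}" for i in count() if f"{base}{i}" not in avoid)

def rename_if_needed(y, body, avoid):
    """Rename the binder y of body if it clashes with a name in avoid."""
    if y not in avoid:
        return y, body
    y2 = fresh(y, avoid | free_vars(body))
    return y2, subst(body, y, ('var', y2))

def subst(t, x, s):
    """Capture-avoiding substitution t[x := s]."""
    if t[0] == 'var':
        return s if t[1] == x else t
    if t[0] == 'lam':
        if t[1] == x:
            return t
        y, body = rename_if_needed(t[1], t[2], free_vars(s) | {x})
        return ('lam', y, subst(body, x, s))
    _, r, a, z, u = t
    r, a = subst(r, x, s), subst(a, x, s)
    if z != x:                      # x is shadowed inside z.u otherwise
        z, u = rename_if_needed(z, u, free_vars(s) | {x})
        u = subst(u, x, s)
    return ('app', r, a, z, u)

def beta(t):
    """Root β-contraction (λx r)(s, z.u) ↦ u[z := r[x := s]], else None."""
    if t[0] == 'app' and t[1][0] == 'lam':
        (_, x, r), s, z, u = t[1], t[2], t[3], t[4]
        return subst(u, z, subst(r, x, s))
    return None

def pi(t):
    """Root π-contraction r(s, z.u)(s2, z2.u2) ↦ r(s, z.u(s2, z2.u2)), else None."""
    if t[0] == 'app' and t[1][0] == 'app':
        (_, r, s, z, u), s2, z2, u2 = t[1], t[2], t[3], t[4]
        # z must not capture free variables of the argument pulled inside
        z, u = rename_if_needed(z, u, free_vars(s2) | (free_vars(u2) - {z2}))
        return ('app', r, s, z, ('app', u, s2, z2, u2))
    return None

def plain_app(r, s):
    """Embedding of ordinary application: r s := r(s, z.z)."""
    return ('app', r, s, 'z', ('var', 'z'))
```

With this encoding, `beta(plain_app(('lam', 'x', ('var', 'x')), ('var', 'y')))`
yields the variable `y`, in line with the remark that the embedding preserves
β-reduction, while `pi` applied to an embedded application `(xy)w` produces a
term that is no longer of the form `plain_app(...)`.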

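The reduction relation ${\to} := {\to_\beta} \cup {\to_\pi}$ coincides with the
compatible closure of ${\mapsto_\beta} \cup {\mapsto_\pi}$. We use $\to_\rho^*$
to denote the reflexive, transitive closure of $\to_\rho$, where $\to_\rho$
ranges over $\to_\beta$, $\to_\pi$ and $\to$. Since the relations $\to_\rho$
are generated by the compatible closure, they are compatible in the sense that
$s \to_\rho s'$ implies $r[x := s] \to_\rho^* r[x := s']$, and
$r[x := s] \to_\rho r[x := s']$ if $x$ occurs exactly once in $r$. As the
underlying contraction rules are closed under substitution, the reduction
relations are also substitutive insofar as $r \to_\rho r'$ implies
$r[x := s] \to_\rho r'[x := s]$. Thus $\to_\rho^*$ is parallel:

$$r \to_\rho^* r' \wedge s \to_\rho^* s' \implies r[x := s] \to_\rho^* r'[x := s'].$$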

It will be useful to view $r(s, z.t)$ as the application of $r$ to the object
$(s, z.t)$, which we call a generalized argument,⁵ denoted by capital letters
$R, S, T, U$.

Permutative reductions act on generalized arguments: Let $R\{S\}$ be the
permutation of $S$ into $R = (s, z.t)$ defined by

$$R\{S\} = (s, z.t)\{S\} := (s, z.tS)$$

(we may assume that $z$ does not occur free in $S$). Using this notation,
permutative contraction reads $rRS \mapsto_\pi rR\{S\}$.

We will also need to consider multiple generalized arguments (written as
$\vec{R}, \vec{S}, \vec{T}, \vec{U}$) to denote $n \ge 0$ many generalized
arguments. If $\vec{R}$ represents

$R_1, \ldots, R_n$ then $r\vec{R}$ represents the multiple generalized
application $rR_1 \ldots R_n$, which can only be parenthesized as
$(\ldots(rR_1)\ldots R_n)$.

⁴ This should be contrasted with the converse embedding, which neither
preserves nor simulates reductions although it respects the equality generated
by $\to$.

⁵ In the light of the translation of ΛJ into Λ, $(s, z.t)$ is not an argument
to $r$ but consists of an argument $s$ and information $z.t$ how to use the
result of the application of $r$ to $s$.

With this notational machinery we arrive at the following inductive
characterization of ΛJ:

$$\Lambda J \ni r, s, t ::= x\vec{R} \mid (\lambda x r)\vec{R}.$$

By splitting the two cases into subcases according to the length of the
trailing $\vec{R}$ we can identify the subgrammars $NF_\beta$, $NF_\pi$, $NF$
of normal terms w.r.t. $\to_\beta$, $\to_\pi$, $\to$, i.e., terms that are not
further reducible by the respective reduction relation.



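$$\begin{array}{rcl}
\Lambda J \ni r, s, t &::=& x \mid x(s, z.t) \mid \lambda x r \mid xRS\vec{S} \mid (\lambda x r)(s, z.t) \mid (\lambda x r)(s, z.t)S\vec{S} \\
NF_\beta \ni r, s, t &::=& x \mid x(s, z.t) \mid \lambda x r \mid xRS\vec{S} \\
NF_\pi \ni r, s, t &::=& x \mid x(s, z.t) \mid \lambda x r \mid (\lambda x r)(s, z.t) \\
NF \ni r, s, t &::=& x \mid x(s, z.t) \mid \lambda x r
\end{array}$$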



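3 Confluence

In contrast to Λ, ΛJ has critical pairs⁶:

$$\begin{array}{ccc}
(\lambda x r)(s, z.t)R & \mapsto_\pi & (\lambda x r)(s, z.tR) \\
\downarrow{\scriptstyle\beta} & & \downarrow{\scriptstyle\beta} \\
t[z := r[x := s]]R & = & (tR)[z := r[x := s]]
\end{array}$$

$$\begin{array}{ccccc}
rRST & \mapsto_\pi & rRS\{T\} & & \\
\downarrow{\scriptstyle\pi} & & & \searrow{\scriptstyle\pi} & \\
rR\{S\}T & \mapsto_\pi & rR\{S\}\{T\} & \to_\pi & rR\{S\{T\}\}
\end{array}$$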

⁶ Analogous to the treatment of Λ in Example 3.4 in [8], ΛJ can be modeled in
simply-typed lambda calculus with a base type term and two constants
abs : (term → term) → term and app : term → term → (term → term) → term. The ΛJ
term $\lambda x r$ is represented by $abs(\lambda x^{term} r^*)$ : term and
$r(s, z.t)$ by $app(r^*, s^*, \lambda z^{term} t^*)$ : term with $r^*$, $s^*$
and $t^*$ the representations of $r$, $s$ and $t$. With variables $X, Y, Z$ of
type term and $F, G$ of type term → term, $\mapsto_\beta$ and $\mapsto_\pi$ are
represented as

$$app(abs(\lambda x^{term}(Fx)), X, \lambda y^{term}(Gy)) \mapsto G(FX) \quad\text{and}$$

$$app(app(X, Y, \lambda x^{term}(Fx)), Z, \lambda y^{term}(Gy)) \mapsto app(X, Y, \lambda x.app(Fx, Z, \lambda y(Gy))).$$

This is a higher-order left-linear pattern rewrite system. The associated
rewrite relation represents that of ΛJ.

As argued in the introduction, we cannot use these joins to infer confluence of
$\to$ by standard techniques of higher-order rewriting. Instead, we prove
confluence of ΛJ by a modular extension of the proof for Λ in [10], which
introduced the beautiful idea of proving confluence by showing a triangle
property. First, permutative reductions are shown to be converging by a
recursive definition of π-normalization that yields a triangle property for
$\to_\pi^*$. Then the Takahashi method for Λ is adapted to prove the triangle
property for β-developments $\Rightarrow_\beta$ in ΛJ. By establishing that
$\Rightarrow_\beta$ and $\to_\pi^*$ commute, we can glue triangles together
with the help of Lemma 2 and obtain the triangle property for the sequential
composition of $\Rightarrow_\beta$ and $\to_\pi^*$. Note that the triangle
property ("one-sided confluence") gives more information on the reduction
relation than mere confluence (see Corollary 4).

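Definition 1. Let $\to$ and $\Rightarrow$ be binary relations.

• We write $\to\Rightarrow$ for the composition ${\Rightarrow} \circ {\to}$.

• $\to^n$ denotes the n-fold composition of $\to$ with itself.

• $\to$ and $\Rightarrow$ commute (written $\to \mathrel{\square} \Rightarrow$)
  if ${\leftarrow}{\Rightarrow} \subseteq {\Rightarrow}{\leftarrow}$.

• $\to$ has the triangle property⁷ w.r.t. a function $f$ (written
  $\to \triangle f$), if $a \to a'$ implies $a' \to f(a)$.

• $\to$ enjoys the diamond property (written $\to \Diamond$) if
  $\to \mathrel{\square} \to$.

• $\to$ is confluent if $\to^* \Diamond$.

• $\Rightarrow$ is a →-development if
  ${\to} \subseteq {\Rightarrow} \subseteq {\to^*}$ and $\Rightarrow$ has the
  triangle property w.r.t. some function $f$, which is then called its complete
  →-development.⁸

For a reflexive relation $\to$ with $\to \triangle f$ we have
$\forall a \in dom(\to).\ a \to f(a)$.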
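On a finite relation, the triangle and diamond properties of Definition 1 can
be checked directly. The following Python sketch is only meant to illustrate
the definitions; the representation of a relation as a set of pairs and the
names `has_triangle` and `has_diamond` are assumptions made here.

```python
def has_triangle(rel, f):
    """Triangle property w.r.t. f: whenever a -> b, also b -> f(a)."""
    return all((b, f[a]) in rel for (a, b) in rel)

def has_diamond(rel):
    """Diamond property: any two one-step successors b, c of the same
    element have a common one-step successor."""
    succ = {}
    for a, b in rel:
        succ.setdefault(a, set()).add(b)
    return all(succ.get(b, set()) & succ.get(c, set())
               for targets in succ.values() for b in targets for c in targets)
```

For example, the relation `{(0, 0), (0, 1), (1, 1)}` has the triangle property
w.r.t. `{0: 1, 1: 1}` and also the diamond property, whereas the fork
`{(0, 1), (0, 2)}` has neither.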

Lemma 1. If $\to \triangle f$ then

(i) $\to \Diamond$,

(ii) $a \to a' \implies f(a) \to f(a')$ (Simulation),

(iii) $\to^n \triangle f^n$ (Triangle$^n$),

(iv) $\to$ is confluent.



Proof. (i). Glue two triangles together.

(ii). $a \to a'$ implies $a' \to f(a)$ by the triangle property, hence
$f(a) \to f(a')$ by the triangle property.

(iii). Induction on $n$. The base case is trivial. For the step case assume
$a \to^n b \to c$. Simulation yields $f(a) \to^n f(b)$, to which we may apply
the induction hypothesis in order to get $f(b) \to^n f^n(f(a))$. Since
$c \to f(b)$ by the triangle property we get $c \to^{n+1} f^{n+1}(a)$.

(iv). We show

$$a_1 \mathrel{{}^n{\leftarrow}} a \to^m a_2 \implies \exists a'.\ a_1 \to^m a' \mathrel{{}^n{\leftarrow}} a_2$$

by induction on the maximum of $m$ and $n$. Without loss of generality
$m \ge n$, say $m = n + k$. Assume
$a_1 \mathrel{{}^n{\leftarrow}} a \to^n a_2 \to^k a_2'$. (iii) constructs a
square of length $n$ by providing an $a'$ with
$a_1 \to^n a' \mathrel{{}^n{\leftarrow}} a_2$. If $k \ne 0 \ne n$ then the
induction hypothesis gives an $a''$ such that
$a' \to^k a'' \mathrel{{}^n{\leftarrow}} a_2'$ (the cases $k = 0$ and $n = 0$
are immediate). □

Corollary 1. If $\Rightarrow$ is a →-development then $\to$ is confluent.

Proof. ${\to} \subseteq {\Rightarrow} \subseteq {\to^*}$ entails
${\Rightarrow^*} = {\to^*}$ and $\Rightarrow^* \Diamond$ by (iv). □

⁷ The triangle property is the strongest confluence-related property considered
in [12].

⁸ Notice that this abstract notion of development is not the standard one.



Lemma 2 (Triangle Composition).

If $\to_1 \triangle f_1$, $\to_2 \triangle f_2$ and
$\to_1 \mathrel{\square} \to_2$ then $\to_1\to_2 \triangle f_2 \circ f_1$.

Proof. Assume $a \to_1 a_1 \to_2 a_2$. By the triangle property for $\to_1$ we
obtain $a_1 \to_1 f_1(a)$. Commutation yields an $a'$ such that
$f_1(a) \to_2 a' \mathrel{{}_1{\leftarrow}} a_2$. By the triangle property for
$\to_2$ we obtain $a' \to_2 f_2(f_1(a))$. □



3.1 Triangle Property of $\to_\pi^*$

$\to_\pi$ is certainly weakly confluent, since its critical pair can be joined.
Strong normalization therefore suffices to establish confluence of $\to_\pi$.
This could be achieved as in [3] by assigning a rough upper bound on the length
of permutation sequences. Instead, we use a structurally recursive definition
of the π-normal form to prove a triangle property for $\to_\pi^*$ (hence
confluence of $\to_\pi$) and even obtain an exact bound on the height of the
π-reduction tree as a byproduct.

Definition 2. By recursion on terms $r$ define the term $r@R$ as follows:

$$r@R := \begin{cases} r'@(s, z.t@R) & \text{if } r = r'(s, z.t), \\ rR & \text{else.} \end{cases}$$

Lemma 3.

(1) $(r@(s, z.t))@R = r@(s, z.t@R)$,

(2) $r, s, t \in NF_\pi \implies r@(s, z.t) \in NF_\pi$,

(3) $rR \to_\pi^* r@R$.

Proof. Each statement is shown by induction on $r$. □

Definition 3 (Complete π-Development).

$$x^\pi := x, \qquad (\lambda x r)^\pi := \lambda x r^\pi, \qquad (r(s, z.t))^\pi := r^\pi @ (s^\pi, z.t^\pi).$$


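Lemma 4.

(4) $r^\pi \in NF_\pi$,

(5) $r \to_\pi^* r^\pi$,

(6) $r \to_\pi s \implies r^\pi = s^\pi$.

Proof. (4) and (5) by induction on $r$ using (2) and (3). (6) by induction on
$r \to_\pi s$ using (1) in case of an outer permutation step. □

Corollary 2 (Triangle Property). $\to_\pi^* \triangle\, ()^\pi$. Thus $\to_\pi$
is confluent.

Proof. $r \to_\pi^* s$ implies $r^\pi = s^\pi \mathrel{{}^*_\pi{\leftarrow}} s$. □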
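The structurally recursive definitions of $r@R$ and of the complete
π-development translate almost literally into code. The Python sketch below is
illustrative: the tuple encoding of terms and the names `at`, `pi_nf`,
`at_steps` and `height` are assumptions of this sketch, and it relies on the
variable convention (the bound variable $z$ of a generalized argument does not
occur free in the argument that is pushed underneath it), so no renaming is
performed. The last two functions follow the step-count recursions that the
text reads off from the proofs of (3) and (5).

```python
# ΛJ terms as nested tuples (an encoding chosen for this sketch):
#   ('var', x) | ('lam', x, r) | ('app', r, s, z, t)   for r(s, z.t)
# A generalized argument (s, z.t) is the triple (s, z, t).

def at(r, R):
    """r@R: push the generalized argument R as deep as possible into r."""
    if r[0] == 'app':
        _, r1, s, z, t = r
        return at(r1, (s, z, at(t, R)))
    return ('app', r) + R

def pi_nf(r):
    """Complete π-development of r, i.e. its π-normal form."""
    if r[0] == 'var':
        return r
    if r[0] == 'lam':
        return ('lam', r[1], pi_nf(r[2]))
    _, r1, s, z, t = r
    return at(pi_nf(r1), (pi_nf(s), z, pi_nf(t)))

def at_steps(r, R):
    """Number of π-steps counted for reducing rR to r@R."""
    if r[0] == 'app':
        _, r1, s, z, t = r
        return 1 + at_steps(t, R) + at_steps(r1, (s, z, at(t, R)))
    return 0

def height(r):
    """Step count for reducing r to its π-normal form."""
    if r[0] == 'var':
        return 0
    if r[0] == 'lam':
        return height(r[2])
    _, r1, s, z, t = r
    return (height(r1) + height(s) + height(t)
            + at_steps(pi_nf(r1), (pi_nf(s), z, pi_nf(t))))
```

For the term $x(y, z.z)(w, v.v)$ this yields the normal form
$x(y, z.z(w, v.v))$ with a count of one step, and for a variable applied to
three generalized arguments the count is three, matching the longer path in the
second critical pair diagram.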

A closer look at the previous proofs reveals an exact bound on the height of the
π-reduction tree, yielding strong normalization of $\to_\pi$. The constructive
content of the proof of (3) yields

$$@(r, R) := \begin{cases} 1 + @(t, R) + @(r', (s, z.t@R)) & \text{if } r = r'(s, z.t), \\ 0 & \text{else,} \end{cases}$$

and $rR \to_\pi^{@(r,R)} r@R$. The constructive content of (5) gives

$$\#x := 0, \qquad \#\lambda x r := \#r, \qquad \#r(s, z.t) := \#r + \#s + \#t + @(r^\pi, (s^\pi, z.t^\pi))$$

and $r \to_\pi^{\#r} r^\pi$. (6) can be sharpened to
$r \to_\pi s \implies \#r > \#s$. This is again shown by induction on
$\to_\pi$, where for the case of an outer permutation step we first show by
induction on $r$ (using (1)) that

$$@(r, (s, z.t)) + @(r@(s, z.t), R) > @(t, R) + @(r, (s, z.t@R)).$$

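3.2 Triangle Property of β-Development

It is easy to apply Takahashi's method [10] to $\to_\beta$ in ΛJ. We spell out
the (few) details in order to keep the presentation self-contained.

Definition 4 (β-Development). Inductively define the binary relation
$\Rightarrow_\beta$ on terms and simultaneously use the abbreviation

$$(s, z.t) \Rightarrow_\beta (s', z'.t') \ :\Leftrightarrow\ s \Rightarrow_\beta s' \wedge z = z' \wedge t \Rightarrow_\beta t'.$$

(V) $x \Rightarrow_\beta x$.

(C) If $r \Rightarrow_\beta r'$ then $\lambda x r \Rightarrow_\beta \lambda x r'$.

(E) If $r \Rightarrow_\beta r'$ and $R \Rightarrow_\beta R'$ then
    $rR \Rightarrow_\beta r'R'$.

(β) If $u \Rightarrow_\beta u'$, $s \Rightarrow_\beta s'$ and
    $t \Rightarrow_\beta t'$ then
    $(\lambda x u)(s, z.t) \Rightarrow_\beta t'[z := u'[x := s']]$.

Obviously, $\Rightarrow_\beta$ is reflexive and therefore
${\to_\beta} \subseteq {\Rightarrow_\beta}$. Also
${\Rightarrow_\beta} \subseteq {\to_\beta^*}$. By (E), $R \Rightarrow_\beta R'$
and $S \Rightarrow_\beta S'$ imply $R\{S\} \Rightarrow_\beta R'\{S'\}$.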

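Lemma 5 (Parallelism). If $r \Rightarrow_\beta r'$ and $s \Rightarrow_\beta s'$
then $r[x := s] \Rightarrow_\beta r'[x := s']$. □

Lemma 6 (Inversion). If $x \Rightarrow_\beta t$ then $t = x$. If
$\lambda x r \Rightarrow_\beta t$ then $t = \lambda x r'$ with
$r \Rightarrow_\beta r'$. □

Definition 5 (Complete β-Development).

$$x^\beta := x, \qquad (\lambda x r)^\beta := \lambda x r^\beta, \qquad
(r(s, z.t))^\beta := \begin{cases} t^\beta[z := u^\beta[x := s^\beta]] & \text{if } r = \lambda x u, \\ r^\beta(s^\beta, z.t^\beta) & \text{otherwise.} \end{cases}$$

Lemma 7 (Triangle Property). $\Rightarrow_\beta \triangle\, ()^\beta$.

Proof. Show $r \Rightarrow_\beta r' \implies r' \Rightarrow_\beta r^\beta$ by
induction on $\Rightarrow_\beta$. The case (β) requires parallelism (Lemma 5).
The most interesting case is (E) where
$r(s, z.t) \Rightarrow_\beta r'(s', z.t')$ has been concluded from
$r \Rightarrow_\beta r'$, $s \Rightarrow_\beta s'$ and $t \Rightarrow_\beta t'$.
By induction hypothesis $r' \Rightarrow_\beta r^\beta$,
$s' \Rightarrow_\beta s^\beta$ and $t' \Rightarrow_\beta t^\beta$.

- If $r = \lambda x r_0$ then $r^\beta = \lambda x r_0^\beta$ and by inversion
  $r'$ is of the form $\lambda x r_0'$ with $r_0 \Rightarrow_\beta r_0'$.
  $r' = \lambda x r_0' \Rightarrow_\beta \lambda x r_0^\beta = r^\beta$ is
  derived from $r_0' \Rightarrow_\beta r_0^\beta$. Using (β) we obtain
  $(\lambda x r_0')(s', z.t') \Rightarrow_\beta t^\beta[z := r_0^\beta[x := s^\beta]]$.

- Otherwise we simply apply (E). □

Using $r \Rightarrow_\beta r$ we obtain as a corollary
$r \Rightarrow_\beta r^\beta$.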
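The complete β-development contracts exactly the β-redexes that are already
present in a term, working from the inside out, and leaves newly created
redexes alone. A minimal Python sketch follows; the tuple encoding and the
names `free_vars`, `fresh`, `rename_if_needed`, `subst` and `beta_dev` are
assumptions of this sketch, and binder renaming stands in for the variable
convention.

```python
from itertools import count

# ΛJ terms as nested tuples (an encoding chosen for this sketch):
#   ('var', x) | ('lam', x, r) | ('app', r, s, z, t)   for r(s, z.t)

def free_vars(t):
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'lam':
        return free_vars(t[2]) - {t[1]}
    _, r, s, z, u = t
    return free_vars(r) | free_vars(s) | (free_vars(u) - {z})

def fresh(base, avoid):
    """First name base0, base1, ... that is not in avoid."""
    return next(f"{base}{i}" for i in count() if f"{base}{i}" not in avoid)

def rename_if_needed(y, body, avoid):
    """Rename the binder y of body if it clashes with a name in avoid."""
    if y not in avoid:
        return y, body
    y2 = fresh(y, avoid | free_vars(body))
    return y2, subst(body, y, ('var', y2))

def subst(t, x, s):
    """Capture-avoiding substitution t[x := s]."""
    if t[0] == 'var':
        return s if t[1] == x else t
    if t[0] == 'lam':
        if t[1] == x:
            return t
        y, body = rename_if_needed(t[1], t[2], free_vars(s) | {x})
        return ('lam', y, subst(body, x, s))
    _, r, a, z, u = t
    r, a = subst(r, x, s), subst(a, x, s)
    if z != x:                      # x is shadowed inside z.u otherwise
        z, u = rename_if_needed(z, u, free_vars(s) | {x})
        u = subst(u, x, s)
    return ('app', r, a, z, u)

def beta_dev(t):
    """Complete β-development of t, following Definition 5."""
    if t[0] == 'var':
        return t
    if t[0] == 'lam':
        return ('lam', t[1], beta_dev(t[2]))
    _, r, s, z, u = t
    if r[0] == 'lam':               # the head is a β-redex: contract it
        _, x, body = r
        return subst(beta_dev(u), z, subst(beta_dev(body), x, beta_dev(s)))
    return ('app', beta_dev(r), beta_dev(s), z, beta_dev(u))
```

Applied to $(\lambda x.\,x(a, z.z))(\lambda y y, w.w)$ the function returns
$(\lambda y y)(a, z.z)$: the redex created by the substitution is not
contracted, and only a second application of `beta_dev` yields $a$.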



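3.3 Confluence of $\to$

We now establish that ${\Rrightarrow} := {\Rightarrow_\beta}{\to_\pi^*}$ is a
→-development. Obviously, ${\to} \subseteq {\Rrightarrow} \subseteq {\to^*}$.
The triangle property will be shown in the next corollary. Note that
$\Rrightarrow$ is even parallel, since $\Rightarrow_\beta$ and $\to_\pi^*$ are.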

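Lemma 8 (Commutation). $\Rightarrow_\beta \mathrel{\square} \to_\pi^*$, i.e.,
${}^*_\pi{\leftarrow}{\Rightarrow_\beta} \subseteq {\Rightarrow_\beta}\,{}^*_\pi{\leftarrow}$.

Proof. Induction on $\Rightarrow_\beta$. First note that it suffices to show the
claim for one $\to_\pi$ step in the assumption.

Case (V). Trivial.

Case (C). Simple application of the induction hypothesis.

Case (E). The only interesting subcase is (E) facing an outer permutation

$$rR\{S\} \mathrel{{}_\pi{\leftarrow}} rRS \Rightarrow_\beta r'S' \quad\text{with } rR \Rightarrow_\beta r' \text{ and } S \Rightarrow_\beta S'.$$

We argue according to the rule used for deriving $rR \Rightarrow_\beta r'$.

Subsubcase (E). Then $r'$ is of the form $r_0'R'$ and the parallel reduction has
been concluded from $r \Rightarrow_\beta r_0'$ and $R \Rightarrow_\beta R'$. We
get

$$rR\{S\} \Rightarrow_\beta r_0'R'\{S'\} \mathrel{{}_\pi{\leftarrow}} r_0'R'S'$$

using (E) twice.

Subsubcase (β). Here $r$ is of the form $\lambda x r_0$, $R$ is $(s, z.t)$ and
the reductions read

$$(\lambda x r_0)(s, z.tS) \mathrel{{}_\pi{\leftarrow}} (\lambda x r_0)(s, z.t)S \Rightarrow_\beta t'[z := r_0'[x := s']]S'.$$

Using (E) and (β) we obtain

$$(\lambda x r_0)(s, z.tS) \Rightarrow_\beta t'[z := r_0'[x := s']]S'.$$

Case (β). The situation is as follows:

$$(\lambda x u'')(s'', z.t'') \mathrel{{}_\pi{\leftarrow}} (\lambda x u)(s, z.t) \Rightarrow_\beta t'[z := u'[x := s']]$$

concluded from $u \Rightarrow_\beta u'$, $s \Rightarrow_\beta s'$,
$t \Rightarrow_\beta t'$ and with exactly one permutative step in
$u \to_\pi^* u''$, $s \to_\pi^* s''$ and $t \to_\pi^* t''$. Choose
$u''', s''', t'''$ according to the induction hypothesis. Thus
$u'[x := s'] \to_\pi^* u'''[x := s''']$ by parallelism of $\to_\pi^*$ and,
similarly,

$$t'[z := u'[x := s']] \to_\pi^* t'''[z := u'''[x := s''']] \mathrel{{}_\beta{\Leftarrow}} (\lambda x u'')(s'', z.t'').$$

□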



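Corollary 3. $\Rrightarrow \triangle\, (()^\beta)^\pi$.

Proof. Apply Lemma 2 to Lemma 7, Corollary 2 and the previous lemma. □

Corollary 4. Abbreviate $r^\triangledown := (r^\beta)^\pi$.

(i) $\to$ is confluent,

(ii) $r \to^n s \implies s \to^* r^{\triangledown \cdots \triangledown}$ (with
     $\triangledown$ applied $n$ times).

Proof. (i). $\Rrightarrow$ is a →-development by the previous corollary.
(ii). Use simulation (Lemma 1 (ii)) and
${\to} \subseteq {\Rrightarrow} \subseteq {\to^*}$. □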



4 Standardization

We want to show that if $r \to^* r'$ then $r \to_s r'$, which expresses that
there is some standard reduction sequence from $r$ to $r'$. For the pure
λ-calculus $\to_s$ is defined by

(V) If $\vec{r} \to_s \vec{r}'$ then $x\vec{r} \to_s x\vec{r}'$.

(C) If $r \to_s r'$ and $\vec{s} \to_s \vec{s}'$ then
    $(\lambda x r)\vec{s} \to_s (\lambda x r')\vec{s}'$.

(β) If $r[x := s]\vec{s} \to_s t$ then $(\lambda x r)s\vec{s} \to_s t$.

Clearly, we can use a derivation of $r \to_s r'$ to devise various reduction
sequences from $r$ to $r'$. Which reduction sequences are captured by this? (β)
allows converting a head β-redex and proceeding. Application of rule (C)
expresses that such a redex is kept and further standard reductions may only be
performed in the constituent terms. Notice that no order is imposed on these
subsequent reductions. The same holds for the arguments of a term $x\vec{r}$
(rule (V)). As an example, a reduction step of the form
$(\lambda x x)((\lambda y r)s) \to (\lambda x x)(r[y := s])$ may occur in a
standard reduction sequence and further standard reductions in $r[y := s]$ may
follow, but no outer β-reduction of $(\lambda x x)r'$ with $r'$ some descendant
of $r[y := s]$ is allowed any more.⁹

4.1 Standard Reductions

Definition 6 (Standard Reduction). Inductively define the binary relation
$\to_s$ on terms and simultaneously use the following abbreviations:

$$(s, z.t) \to_s (s', z'.t') \ :\Leftrightarrow\ s \to_s s' \wedge z = z' \wedge t \to_s t'.$$

$$\vec{R} \to_s \vec{R}' \ :\Leftrightarrow\ \vec{R} = R_1, \ldots, R_n \wedge \vec{R}' = R_1', \ldots, R_n' \wedge \forall i \in \{1, \ldots, n\} : R_i \to_s R_i'.$$

(V) If $\vec{R} \to_s \vec{R}'$ then $x\vec{R} \to_s x\vec{R}'$.

(C) If $r \to_s r'$ and $\vec{S} \to_s \vec{S}'$ then
    $(\lambda x r)\vec{S} \to_s (\lambda x r')\vec{S}'$.

(β) If $t[z := r[x := s]]\vec{S} \to_s v$ then
    $(\lambda x r)(s, z.t)\vec{S} \to_s v$.

(π) If $rR\{S\}\vec{S} \to_s v$ then $rRS\vec{S} \to_s v$.

Given a term $r\vec{S}$, rule (π) allows standard reduction strategies to start
with various permutations of the trailing arguments $\vec{S}$ until either a
possible head β-redex is converted or reduction in the subterms is started by
rules (V) and (C). Later use of permutative reductions affecting the outer
structure (the term decomposition as displayed in the rules) of the term is
forbidden, since the abbreviation $\vec{R} \to_s \vec{R}'$ is defined by
componentwise standard reduction.

By choosing, say, left-to-right order of reductions in (V) and (C) we get

Lemma 9. ${\to_s} \subseteq {\to^*}$. □

4.2 The Standardization Theorem

Our goal is to show that ${\to^*} \subseteq {\to_s}$. The proof follows ideas in
[2] (showing standardization for Λ). We profit from our inductive definition of
$\to_s$, which integrates permutative reductions so smoothly.

Lemma 10.

(1) $\to_s$ is reflexive.

(2) If $r \to_s r'$ and $\vec{S} \to_s \vec{S}'$ then
    $r\vec{S} \to_s r'\vec{S}'$.

(3) If $R \to_s R'$ and $S \to_s S'$ then $R\{S\} \to_s R'\{S'\}$.

⁹ Formally, one can inductively characterize the reduction sequences that are
captured by a derivation of $r \to_s s$ and call them standard reduction
sequences. An infinite reduction sequence is standard if all initial finite
subsequences are.




152 Felix Joachimski and Ralph Matthes 



Proof. (1). Obvious induction on terms, needing only rules (V) and (C). (2). By
induction on $r \to_s r'$. Only the cases (β) and (π) need the induction
hypothesis. (3). Uses (2). □

Lemma 11 (Parallelism). If $r \to_s r'$ and $s \to_s s'$ then
$r[x := s] \to_s r'[x := s']$.¹⁰

Proof. By induction on $r \to_s r'$. The only subcase of interest is (V) with
$x\vec{R} \to_s x\vec{R}'$, so that
$s\vec{R}[x := s] \to_s s'\vec{R}'[x := s']$ has to be shown. This is achieved
by the induction hypothesis and repeated applications of (2). □

Lemma 12. If $r \to_s r' \to r''$ then $r \to_s r''$.

Proof. Induction on $r \to_s r'$.

Case (V). Let $x\vec{R} \to_s x\vec{R}' \to r''$ thanks to
$\vec{R} \to_s \vec{R}'$. Either $r''$ is reached by reducing in one of the
$\vec{R}'$ or by a permutative reduction between the $\vec{R}'$.

- In the first case $r'' = x\vec{R}''$ with $\vec{R}''$ being derived from
  $\vec{R}'$ via reduction of a single term in $\vec{R}'$. By induction
  hypothesis $\vec{R} \to_s \vec{R}''$, hence
  $x\vec{R} \to_s x\vec{R}'' = r''$.

- In the second case $\vec{R} = \vec{S}ST\vec{T}$,
  $\vec{R}' = \vec{S}'S'T'\vec{T}'$ and $r'' = x\vec{S}'S'\{T'\}\vec{T}'$.
  By Lemma 10 (3) $S\{T\} \to_s S'\{T'\}$. Therefore,
  $x\vec{S}S\{T\}\vec{T} \to_s x\vec{S}'S'\{T'\}\vec{T}'$ and by rule (π)
  $x\vec{R} = x\vec{S}ST\vec{T} \to_s x\vec{S}'S'\{T'\}\vec{T}' = r''$.

Case (C). Let $(\lambda x r)\vec{S} \to_s (\lambda x r')\vec{S}' \to v$ thanks
to $r \to_s r'$ and $\vec{S} \to_s \vec{S}'$. We have to show
$(\lambda x r)\vec{S} \to_s v$. Distinguish cases according to the rewrite step
$(\lambda x r')\vec{S}' \to v$:

- If it is a reduction inside $r'$ or one of $\vec{S}'$ then the induction
  hypothesis applies.

- If it is an outer β-reduction then $\vec{S} = (s, z.t)\vec{T}$,
  $\vec{S}' = (s', z.t')\vec{T}'$ and $v = t'[z := r'[x := s']]\vec{T}'$ with
  $s \to_s s'$, $t \to_s t'$ and $\vec{T} \to_s \vec{T}'$. By parallelism

  $$r[x := s] \to_s r'[x := s'] \quad\text{and}\quad t[z := r[x := s]] \to_s t'[z := r'[x := s']].$$

  Repeated applications of Lemma 10 (2) yield
  $t[z := r[x := s]]\vec{T} \to_s v$, hence
  $(\lambda x r)(s, z.t)\vec{T} \to_s v$.

- The remaining case is a permutative contraction:
  $\vec{S} = \vec{S}_0ST\vec{T}$, $\vec{S}' = \vec{S}_0'S'T'\vec{T}'$ and
  $v = (\lambda x r')\vec{S}_0'S'\{T'\}\vec{T}'$. This is handled like the
  second case of (V) above.

Cases (β) and (π). Trivial application of the induction hypothesis. □

Corollary 5 (Standardization). If $r \to^* r'$ then $r \to_s r'$.

Proof. Induction on the number of rewrite steps in $r \to^* r'$. □

By Lemma 9 and the previous corollary $\to_s$ and $\to^*$ are (extensionally)
the same relation. The virtue of the standardization theorem is that it
provides a new inductive characterization of $\to^*$, namely the inductive
definition of $\to_s$. This will be used in the next subsection.

¹⁰ Note that the lemma is a trivial consequence of ${\to_s} = {\to^*}$ but is
necessary to derive it. In contrast to that, parallelism of $\Rightarrow_\beta$
(Lemma 5) is an additional feature for non-trivial developments.

4.3 Inductive Characterization of the Weakly Normalizing Terms

As a typical application of standardization, we characterize the set of weakly
normalizing terms

$$wn := \{r \mid \exists s.\ r \to^* s \in NF\}$$

of ΛJ by a syntax-directed inductive definition, incorporating a specific
standard reduction strategy.

Definition 7. Inductively define the set WN by

(V) $x \in WN$, and if $s \in WN$ and $t \in WN$ then $x(s, z.t) \in WN$.

(C) If $r \in WN$ then $\lambda x r \in WN$.

(β) If $t[z := r[x := s]]\vec{S} \in WN$ then
    $(\lambda x r)(s, z.t)\vec{S} \in WN$.

(π)⁻ If $xR\{S\}\vec{S} \in WN$ then $xRS\vec{S} \in WN$.

This definition is syntax-directed: A candidate for WN can only enter via the
single rule pertaining to its form according to our inductive term
characterization.

The reduction strategy underlying WN is: Perform head β-reductions as long
as possible. Then reduce below leading λs. If a variable appears in the head
position of the term, permute all the generalized arguments into the first one
(from left to right), and then continue with the two terms in the remaining
generalized argument in parallel. WN defines those terms on which this process
succeeds. In this situation parallel and sequential composition of reductions
are equivalent.

Our aim is to prove that WN = wn. This is done with the help of a restricted
notion $\leadsto$ of standard reduction which describes the graph of the
(partial) normalization function.
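The strategy just described can be turned into a small normalizer. The Python
sketch below is an illustration under stated assumptions, not the authors'
implementation: the tuple encoding, the helper names (`spine`, `unspine`,
`normalize`, `OutOfFuel`) and the `fuel` parameter are choices made here. Since
ΛJ is not normalizing, the step budget makes the function give up with an
exception on terms where the strategy does not terminate within the budget.

```python
from itertools import count

# ΛJ terms as nested tuples (an encoding chosen for this sketch):
#   ('var', x) | ('lam', x, r) | ('app', r, s, z, t)   for r(s, z.t)

def free_vars(t):
    if t[0] == 'var':
        return {t[1]}
    if t[0] == 'lam':
        return free_vars(t[2]) - {t[1]}
    _, r, s, z, u = t
    return free_vars(r) | free_vars(s) | (free_vars(u) - {z})

def fresh(base, avoid):
    return next(f"{base}{i}" for i in count() if f"{base}{i}" not in avoid)

def rename_if_needed(y, body, avoid):
    """Rename the binder y of body if it clashes with a name in avoid."""
    if y not in avoid:
        return y, body
    y2 = fresh(y, avoid | free_vars(body))
    return y2, subst(body, y, ('var', y2))

def subst(t, x, s):
    """Capture-avoiding substitution t[x := s]."""
    if t[0] == 'var':
        return s if t[1] == x else t
    if t[0] == 'lam':
        if t[1] == x:
            return t
        y, body = rename_if_needed(t[1], t[2], free_vars(s) | {x})
        return ('lam', y, subst(body, x, s))
    _, r, a, z, u = t
    r, a = subst(r, x, s), subst(a, x, s)
    if z != x:
        z, u = rename_if_needed(z, u, free_vars(s) | {x})
        u = subst(u, x, s)
    return ('app', r, a, z, u)

class OutOfFuel(Exception):
    """Raised when the step budget is exhausted."""

def spine(t):
    """Split t into its head (variable or abstraction) and its list of
    generalized arguments (s, z, u), leftmost first."""
    args = []
    while t[0] == 'app':
        args.append(t[2:])
        t = t[1]
    return t, args[::-1]

def unspine(head, args):
    for s, z, u in args:
        head = ('app', head, s, z, u)
    return head

def normalize(t, fuel=200):
    budget = [fuel]
    def go(t):
        if budget[0] <= 0:
            raise OutOfFuel
        budget[0] -= 1
        head, args = spine(t)
        if head[0] == 'lam':
            _, x, r = head
            if not args:                                   # below a leading λ
                return ('lam', x, go(r))
            (s, z, u), rest = args[0], args[1:]            # head β-reduction
            return go(unspine(subst(u, z, subst(r, x, s)), rest))
        if not args:                                       # a bare variable
            return head
        if len(args) == 1:                                 # normalize both parts
            s, z, u = args[0]
            return ('app', head, go(s), z, go(u))
        (s, z, u), (s2, z2, u2), rest = args[0], args[1], args[2:]
        # permute the second generalized argument into the first one
        z, u = rename_if_needed(z, u, free_vars(s2) | (free_vars(u2) - {z2}))
        return go(unspine(head, [(s, z, ('app', u, s2, z2, u2))] + rest))
    return go(t)
```

On $(\lambda x x)(y, z.z)$ the function returns $y$, on $x(y, z.z)(w, v.v)$ it
returns $x(y, z.z(w, v.v))$, and on the self-reproducing term
$\omega(\omega, z.z)$ with $\omega = \lambda x.\,x(x, z.z)$ it raises
`OutOfFuel`.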

Definition 8. Inductively define the binary relation $\leadsto$ on terms:

(V) $x \leadsto x$, and if $s \leadsto s'$ and $t \leadsto t'$ then
    $x(s, z.t) \leadsto x(s', z.t')$.

(C) If $r \leadsto r'$ then $\lambda x r \leadsto \lambda x r'$.

(β) If $t[z := r[x := s]]\vec{S} \leadsto v$ then
    $(\lambda x r)(s, z.t)\vec{S} \leadsto v$.

(π)⁻ If $xR\{S\}\vec{S} \leadsto v$ then $xRS\vec{S} \leadsto v$.

Comparison with $\to_s$ shows that (V) deals with multiple generalized arguments
of length 0 and 1 only, (C) lost its trailing $\vec{S}$, and permutations (rule
(π)⁻) are only allowed between the first two generalized arguments of
variables.

Lemma 13. ${\leadsto} \subseteq WN \times NF$. □

Lemma 14. If $r \in WN$ then there is a term $r'$ with $r \leadsto r'$. □

As a corollary, $\leadsto$ is even a function from WN to NF, since the
definition of $\leadsto$ is syntax-directed in the left argument.

In order to establish the main lemma of this subsection, we need to show that
the restricted version (π)⁻ in the definition of $\leadsto$ is sufficient to
derive the full rule (π).

Proposition 1. If $rR\{S\}\vec{S} \leadsto v$ then $rRS\vec{S} \leadsto v$.

Proof. Course-of-generation induction on $\leadsto$ (allowing recourse to the
whole derivation). We distinguish cases according to the forms of $r$ in our
inductive characterization of terms.

Case $x$. This is included in (π)⁻.

Case $x(s, z.t)$. $\vec{S}$ has the form $S_1, \ldots, S_n$.
$x(s, z.t)R\{S\}\vec{S} \leadsto v$ has been derived from
$x(s, z.tR\{S\})\vec{S} \leadsto v$. This in turn has been derived from
$x(s, z.tR\{S\}S_1)S_2 \ldots S_n \leadsto v$. Repeating this argument, we find
that the relation has been derived from $x(s, z.tR\{S\}\vec{S}) \leadsto v$,
which must have been inferred by (V) from $v = x(s', z.v')$ with
$s \leadsto s'$ and $tR\{S\}\vec{S} \leadsto v'$. By induction hypothesis
$tRS\vec{S} \leadsto v'$. Consequently, $x(s, z.tRS\vec{S}) \leadsto v$. Rule
(π)⁻ applied $n + 2$ times then yields $x(s, z.t)RS\vec{S} \leadsto v$.

Case $xTU\vec{U}$. $xTU\vec{U}R\{S\}\vec{S} \leadsto v$ comes from
$xT\{U\}\vec{U}R\{S\}\vec{S} \leadsto v$. By induction hypothesis
$xT\{U\}\vec{U}RS\vec{S} \leadsto v$. Hence,
$xTU\vec{U}RS\vec{S} \leadsto v$ by rule (π)⁻.

Case $\lambda x r_0$. $R$ has the form $(s, z.t)$.
$(\lambda x r_0)(s, z.tS)\vec{S} \leadsto v$ has been derived from
$(tS)[z := r_0[x := s]]\vec{S} \leadsto v$. We are done by rule (β) since by
the variable convention

$$(tS)[z := r_0[x := s]]\vec{S} = t[z := r_0[x := s]]S\vec{S}.$$

Case $(\lambda x r_0)(s, z.t)\vec{R}$. Rule (β) has been applied, so the
induction hypothesis and rule (β) suffice to prove this case. □



Lemma 15. If $r \to_s r'$ and $r' \in NF$ then $r \leadsto r'$.¹¹

Proof. Induction on $\to_s$.

Case (V). $x\vec{R}'$ is in NF if $\vec{R}'$ is empty or a single generalized
argument $(s', z.t')$ with $s', t' \in NF$. So we are done by rule (V) and the
induction hypothesis.

Case (C). $(\lambda x r')\vec{S}'$ is normal if $\vec{S}'$ is empty and $r'$ is
normal. Now apply (C) to the induction hypothesis.

Case (β). By induction hypothesis (the same rule is used in $\to_s$ and
$\leadsto$).

Case (π). We have $rRS\vec{S} \to_s v \in NF$ thanks to
$rR\{S\}\vec{S} \to_s v$. By induction hypothesis
$rR\{S\}\vec{S} \leadsto v$. The proposition applies. □

Corollary 6. WN = wn.

As a consequence, $\leadsto$ is indeed the normalization function, defined on
wn.

¹¹ In an analogous treatment of the pure λ-calculus any derivation of
$r \to_s r'$ with $r' \in NF$ is already a derivation of $r \leadsto r'$. In
ΛJ, permutations require remodeling of the derivation.

Proof. WN ⊆ wn: Obvious induction on WN. wn ⊆ WN: Let $r \to^* r' \in NF$. By
standardization $r \to_s r'$. By the preceding lemma $r \leadsto r'$, hence by
Lemma 13 $r \in WN$. □

Notice that the proof of wn ⊆ WN essentially uses the inductive characterization
of $\to^*$ by the definition of $\to_s$ as the main fruit of the
standardization theorem. A simple induction on the length of reduction
sequences fails to prove wn ⊆ WN.

Abstract. Linear Second-Order Unification and Context Unification are closely
related problems. However, their equivalence was never formally proved. Context
unification is a restriction of linear second-order unification. Here we prove
that linear second-order unification can be reduced to context unification with
tree-regular constraints.

Decidability of context unification is still an open question. We comment on
the possibility that linear second-order unification is decidable, if context
unification is, and how to get rid of the tree-regular constraints. This is
done by reducing rank-bound tree-regular constraints to word-regular
constraints.



1 Introduction

Context Unification (CU) [10, 11] is an extension of First-Order Unification
where, in addition to the first-order variables, we also have variables that
denote contexts. These context variables are applied to arguments, thus the
term $F(t)$ denotes any term containing $t$ as a subterm, and $F$ denotes the
context surrounding such a subterm $t$. Linear Second-Order Unification (LSOU)
[3, 5] is a restriction of unification in the second-order simply typed
λ-calculus, where only linear terms are considered as possible instances of
second-order variables. A linear term is a λ-term where the outermost
λ-bindings bind one and just one occurrence of the variable. CU is a
restriction of LSOU, therefore both problems lie between the decidable
first-order unification problem and the undecidable second-order one. The
common assumption is that CU is decidable. This is because various restrictions
of this problem [1, 3, 13, 14, 15] make it decidable, while the same
restrictions applied to second-order unification do not [4, 6]. It is also
known that, as for the word theory, the context theory is undecidable beyond
this existential fragment [9]. The natural question to ask is whether, if CU is
decidable, then LSOU is as well. This is the main topic of the present paper.

* The first author is partially supported by the project MODELOGOS funded by
the CICYT.

CU is a restriction of LSOU where 1) third- or higher-order constants are
not allowed, 2) second-order variables are unary, and 3) there are no internal
λ-bindings, and external ones are only used to denote the parameter of a
second-order variable. The common belief was that third- or higher-order
constants do not play an important role w.r.t. the decidability of both
problems, and neither does the use of λ-bindings. The restriction to unary
context variables is not a real restriction because we can replace binary
(similarly n-ary) variables like $F(t_1, t_2)$ by
$F_0(f(F_1(t_1), F_2(t_2)))$, introducing a conjectured constant symbol $f$
(see Subsection 3.2 and [12]). However, the equivalence of both problems was
never formally proved.

The naive attempt to reduce LSOU to CU by replacing bound variables by
new constant symbols does not work, because we have to ensure that
substitutions avoid variable capture. For instance, the following LSOU problem

λx.f(x) =?_LSOU λx.f(Y)

is not solvable. The substitution σ = [Y ↦ x] gives us:

λx.f(x) =?_LSOU λy.f(x)

and the two terms are not λ-equivalent, because an α-conversion is needed in
order to avoid the capture of variable x. However, applying the naive reduction
to this problem we get the following solvable CU problem:

f(c_x) =?_CU f(Y)

We can try to apply a more sophisticated reduction. Take the original LSOU
problem and substitute the bound variables by two distinct constants. However,
this method only works for the most external λ-bindings. Applying the reduction
to the following solvable LSOU problem with internal λ-bindings:

f(g(λx.x), a) =?_LSOU f(Y, Z)

we get the following unsolvable CU problem:

f(g(c_x), a) =?_CU f(Y, Z)

f(g(c'_x), a) =?_CU f(Y, Z)

Bindings can transform free variables into bound variables at different depths.
Somehow we have to ensure that if an instance of a (free) variable contains a
bound variable, then it also contains its corresponding λ-binding. For instance,
given the LSOU problem F(X) =?_LSOU g(λy.y, a) and the following substitutions:

σ_1 = [X ↦ a, F ↦ λz.g(λy.y, z)]

σ_2 = [X ↦ y, F ↦ λz.g(λy.z, a)]

σ_3 = [X ↦ g(λy.y, a), F ↦ λz.z]

only σ_1 and σ_3 are unifiers. As we will show, such a restriction can be ensured
by means of tree automata [2], but it does not seem easy to encode it simply
in terms of context equations.

On the other hand, context unification and word unification are also closely
related problems. Word unification is decidable [8], and can be enriched with
regular restrictions without losing decidability [16]. Tree-regular languages
are to terms as regular languages are to words. Therefore, if context
unification turns out to be decidable, then it seems reasonable to think that
context unification could be enriched with tree-regular restrictions without
losing decidability. To support this hypothesis, we would like to prove that
membership equations on tree-regular languages can be reduced to membership
equations on (word) regular languages, by encoding terms as sequences of symbols
(the traversal sequence). Unfortunately, we can only prove this reduction for a
certain subset of tree-regular languages (what we call rank-bound tree-regular
languages).

There is a proof that, if most general context unifiers are rank-bound, then
CU is decidable [7]. If most general context unifiers were proved to be
rank-bound, then tree-regular restrictions would also be rank-bound, and we
would also have the decidability of LSOU. We comment on this possibility at the
beginning of Section 4.

This paper proceeds as follows. After introducing some basic definitions in
Section 2, we reduce LSOU to CU with tree-regular constraints in Section 3. In
Section 4 we reduce rank-bound tree-regular restrictions to word-regular
restrictions. Finally, in Section 5 we discuss whether these results could be
extended to linear third- or higher-order unification.

2 Preliminaries 

Let Σ be a simply typed signature where first-order constants are denoted by
a, b, ..., and higher-order constants by f, g, h, .... Let 𝒳 be an enumerable
set of simply typed variables where first-order variables are denoted by capital
letters X, Y, Z, ..., second-order variables by F, G, H, ... and bound variables
by lowercase letters x, y, z, .... Types and their orders are defined as usual
in the simply typed λ-calculus. For simplicity, we can assume that there is only
one base type. Other types are built as τ_1 → τ_2 where τ_1 and τ_2 are types.
A term t is said to have arity n if t : τ_1 → ··· → τ_n → τ_0 where τ_0 is a
base type. Second-order terms are also standard: terms constructed using
constants and bound variables of any order, but free variables of order at most
two. Normal terms (β-reduced, η-expanded terms) have the form λx̄.f(t_1, ..., t_n)
where x̄ is a list of bound variables, f is either a bound variable, a free
variable or a constant, and t_1, ..., t_n are normal terms. If t_1 : τ_1, ...,
t_n : τ_n then f : τ_1 → ··· → τ_n → τ_0 where τ_0 is a base type; and, if
x_1 : τ'_1, ..., x_m : τ'_m, then λx̄.f(t_1, ..., t_n) : τ'_1 → ··· → τ'_m → τ_0.
Notice that, as we are in second-order, if f is a free second-order variable
then τ_1, ..., τ_n are base types. A term λx̄.f(t_1, ..., t_n) is said to be
linear if, written in normal form, any bound variable x_i ∈ x̄ occurs exactly
once in f(t_1, ..., t_n). Notice that t_1, ..., t_n are not required to be
linear.
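The linearity condition can be illustrated with a small Python sketch. The term
encoding is an assumption made only for this example and does not come from the
paper: atoms (variables and constants) are strings, an application
f(t_1, ..., t_n) is the tuple (f, t_1, ..., t_n), and an abstraction λx̄.t is
the tuple ("lam", [x_1, ..., x_m], t), so no symbol may itself be named "lam".

```python
def occurrences(term, var):
    """Count the free occurrences of `var` in `term`."""
    if isinstance(term, str):                 # atom: variable or constant
        return int(term == var)
    if term[0] == "lam":                      # ("lam", [bound vars], body)
        _, bound, body = term
        # An inner binder with the same name shadows `var`.
        return 0 if var in bound else occurrences(body, var)
    head, *args = term                        # application head(args...)
    return int(head == var) + sum(occurrences(a, var) for a in args)


def is_linear(term):
    """Each outermost bound variable must occur exactly once in the body."""
    if isinstance(term, str) or term[0] != "lam":
        return True                           # no external binding to check
    _, bound, body = term
    return all(occurrences(body, x) == 1 for x in bound)
```

Only the outermost binder is checked, which matches the remark that the
subterms t_1, ..., t_n are not required to be linear.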

A linear second-order unification (LSOU) problem is a pair¹ t =?_LSOU u of
terms (not necessarily linear). A solution σ of a LSOU problem is a second-order
substitution such that σ(t) and σ(u) are λ-equivalent, and σ(X) is a linear term
for any free variable X.

¹ Notice that a set of equations is equivalent to just one equation.

A context unification (CU) problem is a pair t =?_CU u of terms that contain
neither λ-bindings nor constants of order higher than two, and where
second-order variables are unary. A solution σ of a CU problem t =?_CU u is a
second-order substitution such that σ(t) = σ(u), and σ(X) is a linear term that
does not contain n-ary (n > 1) variables, for any free variable X.

A CU problem with tree-regular constraints is a CU problem t =?_CU u with
a tree-regular constraint v ∈ L(A).² A solution is a ground substitution σ
solving the CU problem, and satisfying σ(v) ∈ L(A).

For simplicity we assume that the signature of the problem allows us to
ensure the existence of a ground solution whenever a solution exists. Notice that
this can always be ensured if we extend the signature Σ so that it contains at
least a constant a : τ_0 for any base type τ_0 and a binary function
f : τ_1 → τ_2 → τ_0 for any base types τ_0, τ_1 and τ_2.

3 From Linear Second-Order Unification to Context 
Unification with Tree-Regular Constraints 

In this section, we prove that LSOU can be reduced to CU plus tree-regular
constraints. This reduction is done in two steps. In Subsection 3.1 we reduce the
LSOU problem to the n-ary context unification problem by removing λ-bindings
and constants with order higher than two. We obtain a context unification
problem with n-ary contexts, i.e., second-order variables of arity n, plus
tree-regular constraints. In Subsection 3.2 we translate n-ary contexts to
(1-ary) contexts.



3.1 Reducing Linear Second-Order Unification to n-ary Context 
Unification plus Tree-Regular Constraints 

The translation from LSOU to n-ary CU has to remove λ-bindings from terms.
Bound variables will be replaced by new constants. In the second-order
λ-calculus, λ-bindings of normal terms are always just below higher-order
constants or bound variables, or are the most external symbol. They are never
just below free variables. We can eliminate external λ-bindings by extending the
signature Σ with an appropriate new unary constant o (if it does not already
contain one) and translating the equation λx̄.s =?_LSOU λȳ.t into
o(λx̄.s) =?_LSOU o(λȳ.t). This new problem does not have external λ-bindings and
is equivalent to the original one.

The elimination of internal A-bindings is performed in three steps: 

First, we conjecture an α-conversion of bound variables in order to allow
unification when they are later translated into constants in the following step.
Notice that the second step of this translation procedure depends on the “names”
of these bound variables.

² Notice that a set of tree-regular constraints is equivalent to just one constraint.

Second, let B ⊆ 𝒳 be a finite set of variables and let A ⊆ listsof(B) be a finite
set of lists of variables from B. We define a translation function trans^{A,B}
that replaces any occurrence of a variable of B by a new first-order constant,
and any occurrence of a λ-abstraction whose list of bound variables is in A also
by a new constant. This set B will be the set of bound variables of the
unification problem resulting from step 1, and A will be the set of lists of
variables used in the λ-abstractions.

The signature Σ' of the resulting n-ary CU problem also depends on the set
B of bound variables conjectured in the previous step, and on the set A of lists
of bound variables of the λ-abstractions. It is defined as follows: Σ' contains
the same constants as Σ, but every constant h or bound variable z of order
higher than two is replaced in Σ' by a new second-order constant h' or c_z,
respectively. The arity of h' (similarly for c_z) is equal to the arity of h
plus its number of non-first-order arguments. Any non-first-order n-ary argument
of h with type τ_1 → ··· → τ_n → τ_0 is replaced by two first-order arguments,
one with a new special type o, and the other with the base type τ_0. For
instance, if h : τ_1 → (τ_2 → τ_3) → τ_4 then h' : τ_1 → o → τ_3 → τ_4. The
signature Σ' also contains a new constant symbol b_[x_1,...,x_n] of type o for
every list [x_1, ..., x_n] ∈ A, and a new constant symbol c_x for every variable
x ∈ B. The set of variables of the resulting problem is 𝒳' = 𝒳 \ B.

Let t ∈ T(Σ, 𝒳) be a term, B ⊆ 𝒳 a set of variables, and A ⊆ listsof(B) a
set of lists of variables from B. The term trans^{A,B}(t) ∈ T(Σ', 𝒳') is defined
by:

trans^{A,B}(c) = c

trans^{A,B}(f(t_1, ..., t_n)) = f(trans^{A,B}(t_1), ..., trans^{A,B}(t_n))

trans^{A,B}(X) = X      if X ∉ B
               = c_X    if X ∈ B

trans^{A,B}(F(t_1, ..., t_n))
    = F(trans^{A,B}(t_1), ..., trans^{A,B}(t_n))      if F ∉ B
    = c_F(trans^{A,B}(t_1), ..., trans^{A,B}(t_n))    if F ∈ B

trans^{A,B}(h(t_1, ..., t_n, λx̄_1.u_1, ..., λx̄_m.u_m))
    = h'(trans^{A,B}(t_1), ..., trans^{A,B}(t_n),
         b_{x̄_1}, trans^{A,B}(u_1), ..., b_{x̄_m}, trans^{A,B}(u_m))

trans^{A,B}(z(t_1, ..., t_n, λx̄_1.u_1, ..., λx̄_m.u_m))
    = c_z(trans^{A,B}(t_1), ..., trans^{A,B}(t_n),
          b_{x̄_1}, trans^{A,B}(u_1), ..., b_{x̄_m}, trans^{A,B}(u_m))

trans^{A,B}(λx̄.t) = λx̄.trans^{A',B}(t)

In the fifth and sixth case, for constants h and variables z with order higher
than two, we assume for simplicity that non-first-order parameters are in the
last positions. The constant h' is the second-order constant associated to h,
c_z is the constant associated to the variable z, and b_{x̄_i} is the constant
associated to the list of variables x̄_i ∈ A of the λ-binding λx̄_i. If, for some
i ∈ [1..m], x̄_i ∉ A, then the translation is undefined. In the last case, A' is
the set of lists A where any list containing variables from x̄ has been removed.
Notice that most external λ-bindings are not removed by this translation.

Third, we introduce a set of tree-regular restrictions over the instantiations of
variables to prevent them from containing constants associated to bound
variables from B without their corresponding λ-bindings.

The tree automaton A^{A,B} = (Σ', Q, Q_f, Δ) that characterizes the set of terms
that do not contain these bound variables from B in free positions is defined as
follows.

- The signature is the set of constants Σ'. Remember that Σ' allows us
to ensure that, if a certain CU problem S is solvable, then S has a ground
solution.

- The set of states is Q = {q_X | X ⊆ B} ∪ {p_X | X ∈ A}, where B is the set of
bound variables and A is the set of λ-bindings.

- There is a single final state: Q_f = {q_∅}.

- The set of transitions Δ is defined as follows:

♦ For any first-order constant a ∈ Σ' not associated to a variable from B:

a → q_∅

♦ For any first-order constant c_x associated to a bound variable x ∈ B and
any first-order constant b_ȳ associated to a list of bound variables ȳ ∈ A:

c_x → q_{x}

b_ȳ → p_ȳ

♦ For any second-order constant f ∈ Σ' not associated to a variable from B,
any constant c_x associated to a bound variable x ∈ B, and any states
q_{A_1}, ..., q_{A_n}:

f(q_{A_1}, ..., q_{A_n}) → q_{A_1 ∪ ... ∪ A_n}

c_x(q_{A_1}, ..., q_{A_n}) → q_{{x} ∪ A_1 ∪ ... ∪ A_n}

♦ For any second-order constant h' ∈ Σ' associated to a higher-order constant
h ∈ Σ, and any second-order constant c_z ∈ Σ' associated to a higher-order bound
variable z ∈ B:

h'(q_{A_1}, ..., q_{A_n}, p_{B_1}, q_{C_1}, ..., p_{B_m}, q_{C_m}) → q_D
c_z(q_{A_1}, ..., q_{A_n}, p_{B_1}, q_{C_1}, ..., p_{B_m}, q_{C_m}) → q_E

where

D = ∪_{i∈[1..n]} A_i ∪ ∪_{j∈[1..m]} (C_j \ B_j)

E = {z} ∪ ∪_{i∈[1..n]} A_i ∪ ∪_{j∈[1..m]} (C_j \ B_j)

Notice that the B_j's are treated in the transitions as sets but they denote
lists: λx,y is not the same λ-abstraction as λy,x, so they have distinct
associated constants but here are treated as the same set.

Then, we introduce a set of tree-regular restrictions over the solution σ of
the translated problem.







Jordi Levy and Mateu Villaret



- For any first-order variable X, the restriction σ(X) ∈ L(A^{A,B}).

- For any second-order variable F, the restriction σ(F(a_1, ..., a_n)) ∈ L(A^{A,B}),
where the a_i ∈ Σ' are first-order constants of the appropriate types.

Example 1. Given the problem f(X, X) =?_LSOU f(g(λx.F(x)), F(g(λy.y))) we
can conjecture the following α-equivalent problem (this is the only solvable one)
f(X, X) =?_LSOU f(g(λx.F(x)), F(g(λx.x))) and translate it into the following
context unification problem with tree-regular constraints

f(X, X) =?_CU f(g'(b_[x], F(c_x)), F(g'(b_[x], c_x)))
σ(X) ∈ L(A^{A,B})
σ(F(a)) ∈ L(A^{A,B})

where the tree automaton A^{A,B} is defined by

a → q_∅                        b_[x] → p_{x}
f(q_A, q_B) → q_{A∪B}          c_x → q_{x}
g'(p_A, q_B) → q_{B\A}
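The automaton of Example 1 can be simulated bottom-up. The following Python
sketch assumes the five transitions are a → q_∅, b_[x] → p_{x}, c_x → q_{x},
f(q_A, q_B) → q_{A∪B} and g'(p_A, q_B) → q_{B\A}; the tuple encoding of terms
and the function names `run` and `accepted` are illustrative choices, not part
of the paper.

```python
EMPTY = frozenset()


def run(term):
    """Bottom-up run: returns ("q", set), ("p", set), or None if no rule applies."""
    symbol, *args = term
    states = [run(arg) for arg in args]
    if any(s is None for s in states):
        return None
    if not args:
        if symbol == "a":
            return ("q", EMPTY)                  # a -> q_{}
        if symbol == "c_x":
            return ("q", frozenset({"x"}))       # c_x -> q_{x}
        if symbol == "b_[x]":
            return ("p", frozenset({"x"}))       # b_[x] -> p_{x}
        return None
    if len(states) == 2:
        (kind1, set1), (kind2, set2) = states
        if symbol == "f" and kind1 == "q" and kind2 == "q":
            return ("q", set1 | set2)            # f(q_A, q_B) -> q_{A u B}
        if symbol == "g'" and kind1 == "p" and kind2 == "q":
            return ("q", set2 - set1)            # g'(p_A, q_B) -> q_{B \ A}
    return None


def accepted(term):
    """The only final state is q_{}: no bound-variable constant is left free."""
    return run(term) == ("q", EMPTY)
```

Under these assumptions the term g'(b_[x], c_x), the translation of g(λx.x), is
accepted, whereas c_x alone is rejected because the constant standing for x
occurs without its binding.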

In the following lemmas we assume that all bound variables are in B and all
bindings in A.

Lemma 1. For any second-order substitution σ such that σ(X) does not contain
variables of B in free positions and the domain of σ does not contain variables
of B either, let τ = trans^{A,B}(σ) be the context substitution defined by
τ(X) = trans^{A,B}(σ(X)). Then, for any term t we have

trans^{A,B}(σ(t)) = τ(trans^{A,B}(t))

Lemma 2. For any second-order term t, the set of free variables of t and B are
disjoint if and only if trans^{A,B}(t) ∈ L(A^{A,B}).

Theorem 1. A LSOU problem s =?_LSOU t is unifiable if and only if there exists
an α-equivalent unification problem s' =?_LSOU t' such that

trans^{A,B}(s') =?_CU trans^{A,B}(t')

and the corresponding tree-regular constraints are solvable. Here, B is the set of
bound variables of s' and t', and A is the set of lists of bound variables
corresponding to λ-abstractions of s' and t'.

Corollary 1. Linear second-order unification is reducible to n-ary context uni- 
fication plus tree-regular constraints. 



3.2 Reducing n-ary Context Unification to (1-ary) Context 
Unification 

In this subsection we reduce the n-ary CU problem to the (1-ary) CU problem.
The same main ideas are also used in other previous papers, like [12]. Given an
n-ary context unification problem S over a signature Σ, if Σ does not contain an
n-ary constant with n ≥ 2 and a first-order constant, we enlarge it with them.
We construct a new context unification problem by iteratively applying the
following rule, until all non-unary context variables of the problem disappear.

For any n-ary context variable F with n ≥ 2, we guess a p-ary constant
symbol g, with p ≥ 2, from the signature. Then, we guess a partition of
{1, ..., n} into p ≤ n many disjoint subsets {π^1_1, ..., π^1_{r_1}}, ...,
{π^p_1, ..., π^p_{r_p}} such that ∪_{i∈[1..p]} {π^i_1, ..., π^i_{r_i}} = {1, ..., n}
and at least two of them are non-empty. We instantiate F by the following
substitution:

[F ↦ λx_1 ··· λx_n. F_0(g(F_1(x_{π^1_1}, ..., x_{π^1_{r_1}}), ..., F_p(x_{π^p_1}, ..., x_{π^p_{r_p}})))]

where F_0, ..., F_p are (maybe non-unary) context or first-order variables.
Example 2. Consider the following n-ary context unification problem:

X(Y(a, b)) =?_CU Y(X(a), b)        (1)

where one of its infinitely many minimal solutions is (see Figure 1)

σ = [ X ↦ λx.Z(Z(x, b), b),
      Y ↦ λx,y.Z(Z(Z(Z(x, b), y), b), b) ]        (2)

where Z is a fresh context variable. We enlarge our signature to Σ' = {a, b, g},
where g is a new binary constant. Now, we can guess a partition of {1, 2} into
two disjoint subsets {1} and {2}, where both are non-empty, and instantiate Y
by:

τ = [Y ↦ λx_1,x_2.Y_0(g(Y_1(x_1), Y_2(x_2)))]

We obtain a new problem (see Figure 2):

X(Y_0(g(Y_1(a), Y_2(b)))) =?_CU Y_0(g(Y_1(X(a)), Y_2(b)))        (3)

which is also solvable, and only contains (unary) context variables.


Theorem 2. n-ary context unification is NP-reducible to (1-ary) context unifi- 
cation. 

4 Translating Tree-Regular Constraints to Regular 
Constraints over Traversal Sequences 

The decidability of context unification with tree-regular constraints, as well as
the decidability of context unification, are still open problems. There is a proof
that, if most general context unifiers are rank-bound, then CU is decidable [7].
However, it is not known if most general context unifiers are in general
rank-bound. In this section we will show that, if this is the case, then the
decidability proof of context unification could be extended to context
unification with tree-regular constraints. Therefore, linear second-order
unification would also be decidable.

Fig. 1. A solution of the LSOU problem (1)

This mentioned proof is based on a reduction of context unification to word 
unification with regular constraints [16], where terms are translated into se- 
quences of symbols (traversal sequences). In the following we present the main 
ideas of this reduction. 

Definition 1. Given a signature (Σ, 𝒳), we define the extended signature
Σ^P = {f^p | f ∈ Σ ∪ 𝒳 ∧ p ∈ Π_arity(f)}
where Π_n is the group of permutations over n elements.

A sequence s ∈ (Σ^P)* is said to be a traversal sequence of a term t, noted
s ∈ trav(t), if:

1. s = t when t = c is a 0-ary symbol

2. s = f^p s_{p(1)} ··· s_{p(n)} when t = f(t_1, ..., t_n), each s_i being a
traversal sequence of t_i for any i ∈ [1..n], and p ∈ Π_n a permutation.

Any traversal sequence of a term characterizes this term. We use an extended
signature with permutations in order to allow the use of distinct traversals,
i.e. the traversals of subterms in distinct possible orders.
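As an illustration of Definition 1, the following Python sketch enumerates all
traversal sequences of a small term. The encoding is an assumption of the
example: a term is a tuple (symbol, subterm_1, ..., subterm_n), a 0-ary symbol
is emitted as a plain string, and a symbol annotated with a permutation is
emitted as the pair (symbol, permutation) with 1-based indices.

```python
from itertools import permutations, product


def traversals(term):
    """Return every traversal sequence of `term` as a list of tokens."""
    symbol, *args = term
    if not args:
        return [[symbol]]                     # case 1: 0-ary symbol
    sub = [traversals(a) for a in args]       # traversals of each subterm
    result = []
    for perm in permutations(range(len(args))):
        # case 2: f^p followed by traversals of t_p(1), ..., t_p(n)
        for choice in product(*(sub[i] for i in perm)):
            seq = [(symbol, tuple(i + 1 for i in perm))]
            for s in choice:
                seq.extend(s)
            result.append(seq)
    return result
```

The number of traversals grows quickly with the size of the term, which is why
a canonical choice among them is useful.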

Definition 2. The rank of a term, rank(t), is defined by rank(a) = 0, for any
constant a, and rank(f(t_1, ..., t_n)) = c where c is the minimum integer
satisfying: there exists a permutation π of indices 1, ..., n such that, for any
i ∈ [1..n], rank(t_{π(i)}) ≤ c − n + i.

Fig. 2. A reduction of the LSOU problem (1) to context unification

This definition is bizarre, but it can be simplified for binary trees as follows:³

rank(a) = 0

rank(f(t_1, t_2)) = rank(t_1) + 1                  if rank(t_1) = rank(t_2)
                    max{rank(t_1), rank(t_2)}      otherwise
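The rank can be computed directly from the general definition. The sketch below
is one possible reading, assuming the inequality in Definition 2 is non-strict:
sorting the ranks of the subterms in ascending order gives the best permutation,
so the minimum c is the maximum of r_i + n − i over the sorted ranks
r_1 ≤ ... ≤ r_n. Terms are encoded as tuples (symbol, subterm_1, ..., subterm_n),
an encoding chosen only for this example.

```python
def rank(term):
    """Rank of a term: 0 for constants, otherwise the least c such that the
    ascending-sorted subterm ranks r_1..r_n satisfy r_i <= c - n + i."""
    symbol, *args = term
    if not args:
        return 0
    ranks = sorted(rank(a) for a in args)
    n = len(ranks)
    return max(r + n - i for i, r in enumerate(ranks, start=1))
```

For binary symbols this agrees with the simplified formula: the rank grows by
one only when both subterms have the same rank, so a long comb-shaped term keeps
rank 1 while a complete binary tree of depth d has rank d.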



The rank of a term allows us to define a normal traversal. 



Definition 3. Given a term t, its normal traversal sequence NF(t) is defined
recursively as follows:

1. If t = a then NF(t) = a.

2. If t = f(t_1, ..., t_n) then let p ∈ Π_n be the permutation satisfying

   i < j ⇒ (rank(t_{p(i)}) < rank(t_{p(j)})
            ∨ (rank(t_{p(i)}) = rank(t_{p(j)}) ∧ p(i) < p(j)))

   Then NF(t) = f^p NF(t_{p(1)}) ··· NF(t_{p(n)}).
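A minimal Python sketch of the normal traversal follows, assuming that the
permutation of Definition 3 visits subterms in ascending order of rank and
breaks ties by position. The tuple encoding of terms and the token format
(symbol, permutation) are assumptions of the example; the rank function is
repeated so that the block is self-contained.

```python
def rank(term):
    """Rank of a term, computed from the ascending-sorted subterm ranks."""
    symbol, *args = term
    if not args:
        return 0
    ranks = sorted(rank(a) for a in args)
    n = len(ranks)
    return max(r + n - i for i, r in enumerate(ranks, start=1))


def normal_traversal(term):
    """NF(t): visit subterms by ascending rank, ties broken by position."""
    symbol, *args = term
    if not args:
        return [symbol]
    order = sorted(range(len(args)), key=lambda i: (rank(args[i]), i))
    out = [(symbol, tuple(i + 1 for i in order))]   # f annotated with p
    for i in order:
        out.extend(normal_traversal(args[i]))
    return out
```

For f(f(a, a), b) the second argument has the smaller rank, so it is visited
first and the root is annotated with the permutation (2, 1).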

Notice that a restriction on the rank of a tree does not imply a restriction
on its size. The following conjecture states that the rank of any most general
context unifier is bound:

Conjecture 1. For any solvable context unification problem t =? u and m.g.u. σ,
we have

rank(σ(t)) ≤ φ(size(t =? u))

where φ is a computable function.

³ Alternatively, we can also define the rank of a binary tree t as the depth of
the maximum complete tree t' (a tree where all leaves are at the same depth)
such that there exists an injective morphism from t' to t.

Traversal sequences of rank-bound terms and regular languages are related.
For any signature Σ and bound n, there exists a regular language R^n such that,
for any n-rank-bound term t there exists a traversal sequence w ∈ trav(t) with
w ∈ R^n. We can restrict the choice of the traversal sequence to the traversal
normal form. Thus, the set of traversal normal forms of rank-bound terms is a
regular language. For instance, for a signature with a constant a and a binary
function f, any term satisfying rank(t) ≤ 1 has a traversal belonging to:

R^1 = ((f^{[1,2]} | f^{[2,1]}) a)* a

and those satisfying rank(t) ≤ 2 have a traversal belonging to:

R^2 = ((f^{[1,2]} | f^{[2,1]}) R^1)* a

This allows us to reduce any CU problem, like X(f(Y(a), Z(b))) =?_CU
Y(f(X(a), Z(b))), to some word unification problem plus traversal equations,
like:

X_0 f^{[1,2]} Y_0 a Y_1 Z_0 b Z_1 X_1 =? Y'_0 f^{[1,2]} X'_0 a X'_1 Z'_0 b Z'_1 Y'_1

X_0 c X_1 ≡ X'_0 c X'_1
Y_0 c Y_1 ≡ Y'_0 c Y'_1
Z_0 c Z_1 ≡ Z'_0 c Z'_1

where the words X_0 and X_1 encode a traversal sequence of the context X, and
X'_0 and X'_1 another traversal of X. The intended meaning of w_1 ≡ w_2 is: w_1
and w_2 are similar traversal sequences of the same term. By similar we mean
that we can bound the number of permutations in which w_1, w_2 and NF(t)
differ, where w_1, w_2 ∈ trav(t). If the rank of this term is bound, then we can
non-deterministically reduce these traversal equations to word equations plus
regular restrictions like

X_0 c X_1 = X'_0 c X'_1      X_0 c X_1 ∈ R^n
Y_0 c Y_1 = Y'_0 c Y'_1      Y_0 c Y_1 ∈ R^n
Z_0 c Z_1 = Z'_0 c Z'_1      Z_0 c Z_1 ∈ R^n

The restriction X_0 c X_1 ∈ R^n ensures that the instances we find for these
words are really traversal sequences.

In what follows we show how membership equations on tree-regular languages
of rank-bound terms can be reduced to membership equations on (word) regular
languages. We start by defining rank-bound tree automata.

Definition 4. For any tree automaton A = (Σ, Q, Q_f, Δ), and any state q_i ∈ Q,
we define⁴

rank(q_i) = max{rank(t) | t ∈ L((Σ, Q, {q_i}, Δ))}

For any tree automaton A = (Σ, Q, Q_f, Δ), we define

rank(A) = max{rank(q_i) | q_i ∈ Q_f}

A tree automaton A is said to be rank-bound if rank(A) < ∞.



⁴ Notice that (Σ, Q, {q_i}, Δ) is similar to A but with a unique final state q_i.

Notice that the rank of the states of a tree automaton satisfies the following
property:

- for any state q having only transitions like c → q, where c ∈ Σ is a 0-ary
constant, rank(q) = 0,

- for any accessible state q_0 having transitions like f(q_1, q_2) → q_0, where
f ∈ Σ is a binary function, we have:

rank(q_0) ≥ max{rank(q_1), rank(q_2)}    if rank(q_1) ≠ rank(q_2)
rank(q_0) ≥ rank(q_1) + 1                if rank(q_1) = rank(q_2)

We translate tree-regular restrictions to (word) regular restrictions over
traversal sequences of terms. For simplicity assume that any symbol of the
signature is, at most, binary. The result can easily be extended to any signature.

For any rank-bound tree automaton A = (Σ, Q, Q_f, Δ), and any state q ∈ Q,
we define a regular language R_q satisfying R_q ∩ trav(t) ≠ ∅ for any term
t ∈ L((Σ, Q, {q}, Δ)), and R_q ⊆ ∪{trav(t) | t ∈ L((Σ, Q, {q}, Δ))}.

We construct the automaton that recognizes R_q using the following rules.
Assume that R_{q'} is already computed for any state q' ∈ Q with rank(q') <
rank(q). Let n = rank(q). The automaton R_q has a pair of states p^s and p^f
for any state p of the tree automaton satisfying rank(p) = n, and some
additional states that we will specify later. The initial state of R_q is q^s,
and there is a single final state, q^f. The set of transitions of R_q is defined
as follows.

- Base case: for any state p ∈ Q satisfying rank(p) = n, and any transition
a → p ∈ Δ, we add a transition from p^s to p^f labeled with a.

- Inductive case 1: for any state p_0 with rank(p_0) = n and any transition
f(p_1, p_2) → p_0 satisfying rank(p_2) < rank(p_1) = rank(p_0),⁵ we can assume
that R_{p_2} is already computed. We add a copy of the automaton R_{p_2}, i.e. a
copy of all its states and transitions (these are the unspecified additional
states). We also add a transition from p_0^s to the initial state of the copy of
R_{p_2} labeled with f^{[2,1]}, a λ-transition from the final state of R_{p_2}
to p_1^s, and another from p_1^f to p_0^f.
For any transition f(p_1, p_2) → p_0 satisfying rank(p_1) < rank(p_2) = rank(p_0)
we do something similar using the label f^{[1,2]}.

⁵ Notice that if rank(p_1) = rank(p_2) = rank(p_0) then rank(p_0) = ∞ and the
tree automaton would be non-rank-bound. Thus the existence of a bound n for the
rank of the tree automaton is crucial in our translation.

- Inductive case 2: for any state p_0 with rank(p_0) = n and any transition
f(p_1, p_2) → p_0 satisfying rank(p_1) < rank(p_0) and rank(p_2) < rank(p_0), we
can assume that R_{p_1} and R_{p_2} have already been computed. We add a copy of
each one of these automata, a transition from p_0^s to the initial state of
R_{p_1} labeled with f^{[1,2]}, a λ-transition from the final state of R_{p_1}
to the initial state of R_{p_2}, and another from the final state of R_{p_2} to
p_0^f.

Notice that with these cases, transitions like f(q, q) → q are not considered,
because this means that rank(q) = ∞, and q can not lead to a final state. The
final automaton associated to A consists of an initial state q_0, a copy of R_q
for any final state q ∈ Q_f, and λ-transitions from q_0 to each one of the
initial states of the R_q's. The set of final states is the set of final states
of the R_q's.

Example 3. The tree automaton defined by the following transitions

0 → q_N,          pair(q_N, q_N) → q_P,    nil → q_L,
suc(q_N) → q_N,   cons(q_P, q_L) → q_L

is translated into a regular automaton (figure not reproduced).

The term cons(pair(suc(suc(0)), suc(0)), cons(pair(suc(0), 0), nil)), recognized
by the tree automaton, has a traversal sequence

cons^{[1,2]} pair^{[1,2]} suc suc 0 suc 0 cons^{[1,2]} pair^{[1,2]} suc 0 0 nil

recognized by the regular automaton.

Theorem 3. For any tree-regular language L(A) of a rank-bound tree automaton
A, let L(B) be the regular language recognized by the automaton B resulting from
applying the previous translation. The following properties hold.

1. If t ∈ L(A) then there exists a sequence l ∈ L(B) such that l ∈ trav(t).

2. If l ∈ L(B) then there exists a term t ∈ L(A) such that l ∈ trav(t).

The set of terms satisfying rank(t) ≤ n defines a tree-regular language.
Moreover, this language can be recognized by a rank-bound tree automaton A_n.
For instance, if Σ = {a, b, h( ), f( , )}, then A_n = (Σ, Q, Q_f, Δ) with

Σ = {a, b, h( ), f( , )},   Q = {q_0, ..., q_n, q_{n+1}},   Q_f = {q_0, ..., q_n},

Δ = { a → q_0,   b → q_0,
      h(q_i) → q_i                    for any i ∈ [0..n+1]
      f(q_i, q_j) → q_{max{i,j}}      for any i, j ∈ [0..n+1] with i ≠ j
      f(q_i, q_i) → q_{i+1}           for any i ∈ [0..n]
      f(q_{n+1}, q_{n+1}) → q_{n+1} }

Definition 5. A tree-regular language L is said to be n rank-bound if for any
term t ∈ L we have rank(t) ≤ n.

Theorem 4. Any rank-bound regular language is recognized by a rank-bound
tree automaton. The language recognized by a rank-bound tree automaton is a
rank-bound tree-regular language.

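The automaton A_n can be simulated by a short Python sketch, assuming its states
q_0, ..., q_{n+1} are represented by the integers 0, ..., n+1 and that q_{n+1}
acts as a sink for all terms of rank greater than n. The tuple encoding of terms
and the function names are illustrative only.

```python
def state(term, n):
    """Index i of the state q_i reached by A_n on `term` (capped at n + 1)."""
    symbol, *args = term
    if not args:
        return 0                              # a -> q_0, b -> q_0
    if len(args) == 1:
        return state(args[0], n)              # h(q_i) -> q_i
    i, j = state(args[0], n), state(args[1], n)
    if i != j:
        return max(i, j)                      # f(q_i, q_j) -> q_max{i,j}
    return min(i + 1, n + 1)                  # f(q_i, q_i) -> q_{i+1}, sink at n+1


def accepts(term, n):
    """Final states are q_0, ..., q_n."""
    return state(term, n) <= n
```

A complete binary tree of depth 3 is rejected by A_1 and accepted by A_3, while
a unary h on top of a term does not change the state reached.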



5 Extending the Results to Higher-Order Unification 

In Section 3 we have shown how linear second-order unification can be reduced 
to context unification with tree-regular constraints. In this section we discuss 
whether this result could be extended to linear higher-order unification. 

Higher-order unification can be defined as the problem of finding a substitution
σ making the normal forms of the two instances σ(s) and σ(t) equal.
When we try to find such a substitution we have to take into account how these
terms will β-reduce after being instantiated. The problem is simple in linear
second-order. We know that any instance of F(t_1, ..., t_n), after β-reduction,
will contain the σ(t_i) as subterms and, representing σ(F(t_1, ..., t_n)) as a
tree, all nodes corresponding to σ(F) will be connected, forming a context. In
third-order the situation is more complicated. First, we have to require
instances of variables to be linear in all λ-bindings, i.e. not only in the most
external λ-bindings. If we apply the substitution F ↦ λy.λz.g_1(y(g_2(z))) to
F(λx.f(x), a) we get g_1(f(g_2(a))). The nodes corresponding to F are no longer
connected: σ(F) is broken into pieces, and some of the arguments can also
disappear. Each one of such pieces forms a kind of context. For instance, if
F : (o → o) → o → o, any instance of F(t_1, t_2) has one of three forms (figure
not reproduced). Each one of these situations is captured respectively by:

F ↦ λx.λy.F_0(λz̄.x(z̄), y)

F ↦ λx.λy.F_0(λz̄.x(F_1(z̄)), y)

F ↦ λx.λy.F_0(λz̄.x(F_1(y, z̄)))

In this example, F_0 is still a third-order typed variable. Moreover, the first
instantiation is F ↦ F_0 in normal form, so it subsumes the other two. The second
one is equal to the first one, if z̄ contains a single variable and we instantiate
F_1 ↦ λz.z. In fact, this classification only makes sense if we translate F_0
into a context variable using the method described in Section 3 for higher-order
constants. We would get:

F ↦ λx.λy.F'_0(X_z̄, x(c_z̄), y)

F ↦ λx.λy.F'_0(X_z̄, x(F_1(c_z̄)), y)

F ↦ λx.λy.F'_0(X_z̄, x(F_1(y, c_z̄)))

The variable X_z̄ encodes the binding λz̄. If we were able to know a priori
how long these λ-bindings can be, and which types they can have, then the
translation would not seem much more complicated than in the second-order case.

Abstract. We investigate word problems and confluence problems for
the following four classes of terminating semi-Thue systems: length-reducing
systems, weight-reducing systems, length-lexicographic systems,
and weight-lexicographic systems. For each of these four classes we
determine the complexity of several variants of the word problem and
confluence problem. Finally we show that the variable membership problem
for quasi context-sensitive grammars is EXPSPACE-complete.



1 Introduction 

The main purpose of semi-Thue systems is to solve word problems for finitely
presented monoids. But since there exists a fixed semi-Thue system ℛ such
that the word problem for the monoid presented by the set of equations that
corresponds to ℛ (in the following we will briefly speak of the word problem for
ℛ) is undecidable [Mar47, Pos47], semi-Thue systems cannot help with the
effective solution of arbitrary word problems either. This motivates the
investigation of restricted classes of semi-Thue systems which give rise to
decidable word problems. One of the most prominent classes of semi-Thue systems
with a decidable word problem is the class of all terminating and confluent
semi-Thue systems. But if we want to have efficient algorithms for the solution
of word problems, this class might also be too large: it is known that for every
n ≥ 3 there exists a terminating and confluent semi-Thue system ℛ such that the
(characteristic function of the) word problem for ℛ is contained in the nth
Grzegorczyk class but not in the (n−1)th Grzegorczyk class [BO84]. Thus the
complexity of the word problem for a terminating and confluent semi-Thue system
can be extremely high. One way to reduce the complexity of the word problem is
to force bounds on the length of derivation sequences. Such a bound can be
forced by restricting to certain subclasses of terminating systems. For instance
it is known that for a length-reducing and confluent semi-Thue system the word
problem can be solved in linear time [Boo82]. On the other hand, in [Loh99] it
was shown that a uniform variant of the word problem for length-reducing and
confluent semi-Thue systems (where the semi-Thue system is also part of the
input) is P-complete.

In this paper we will continue the investigation of the word problem for
restricted classes of terminating and confluent semi-Thue systems. We will study
the following four classes of semi-Thue systems, see e.g. also [BO93], pp 41-42:
length-reducing systems, weight-reducing systems, length-lexicographic systems,
and weight-lexicographic systems. Let C be one of these four classes. We will
study the following five decision problems for C: (i) the word problem for a
fixed confluent ℛ ∈ C, where the input consists of two words, (ii) the uniform
word problem for a fixed alphabet Σ, where the input consists of a confluent
semi-Thue system ℛ ∈ C over the alphabet Σ and two words, (iii) the uniform word
problem, where the input consists of a confluent ℛ ∈ C and two words, (iv) the
confluence problem for a fixed alphabet Σ, where the input consists of a
semi-Thue system ℛ ∈ C over the alphabet Σ, and finally (v) the confluence
problem, where the input consists of an arbitrary ℛ ∈ C. For each of the
resulting 20 decision problems we will determine its complexity, see Table 1,
p 182, and Table 2, p 183. Finally we consider a problem from [BL94], the
variable membership problem for quasi context-sensitive grammars. This problem
was shown to be in EXPSPACE but NEXPTIME-hard in [BL94]. In this paper we will
prove that this problem is EXPSPACE-complete. We assume that the reader is
familiar with the basic notions of complexity theory, in particular with the
complexity classes P, PSPACE, EXPTIME, and EXPSPACE, see e.g. [Pap94].

2 Preliminaries 

In the following let Σ be a finite alphabet. The empty word will be denoted by
ε. A weight-function is a homomorphism f : Σ* → N from the free monoid Σ*
with concatenation to the natural numbers with addition such that f(s) = 0 if
and only if s = ε. The weight-function f with f(a) = 1 for all a ∈ Σ is called the
length-function. In this case for a word s ∈ Σ* we abbreviate f(s) by |s| and call
it the length of s. Furthermore for every a ∈ Σ we denote by |s|_a the number of
different occurrences of the symbol a in s. For a binary relation → on some set,
we denote by →+ (→*) the transitive (reflexive and transitive) closure of →.

In this paper, a deterministic Turing-machine is a tuple M = (Q, Σ, δ, q_0, q_f),
where Q is the finite set of states, Σ is the tape alphabet with Σ ∩ Q = ∅,
δ : (Q\{q_f}) × Σ → Q × Σ × {-1, +1} is the transition function, where -1 (+1)
means that the read-write head moves to the left (right), q_0 ∈ Q is the initial
state, and q_f ∈ Q is the unique final state. The tape alphabet Σ always contains
a blank symbol □. We assume that M has a one-sided infinite tape, whose
cells can be identified with the natural numbers. Note that M cannot perform
any transition out of the final state q_f. These assumptions do not restrict the
computational power of Turing-machines and will always be assumed in this
paper. An input for M is a word w ∈ (Σ\{□})*. A word of the form uqv, where
u,v ∈ Σ* and q ∈ Q, codes the configuration, where the machine is in state q, the
cells 0 to |uv| - 1 contain the word uv, all cells k with k ≥ |uv| contain the blank
symbol □, and the read-write head is scanning cell |u|. We write sqt ⇒_M upv
if M can move in one step from the configuration sqt to the configuration upv,
where q,p ∈ Q and st,uv ∈ Σ*. The language that is accepted by M is defined by
L(M) = {w ∈ (Σ\{□})* | ∃u,v ∈ Σ* : q_0 w ⇒*_M u q_f v}. Note that w ∈ L(M)
if and only if M terminates on the input w. A deterministic linear bounded
automaton is a deterministic Turing-machine that operates in space n + 1 on an
input of length n.

A semi-Thue system R over Σ, briefly STS, is a finite set R ⊆ Σ* × Σ*,
whose elements are called rules. See [BO93] for a good introduction to the theory
of semi-Thue systems. A rule (s,t) will also be written as s → t. The sets
dom(R) of all left-hand sides and ran(R) of all right-hand sides are defined by
dom(R) = {s | ∃t : (s,t) ∈ R} and ran(R) = {t | ∃s : (s,t) ∈ R}. We define the
two binary relations →_R and ↔_R as follows, where x,y ∈ Σ*:

- x →_R y if there exist u,v ∈ Σ* and (s,t) ∈ R with x = usv and y = utv.

- x ↔_R y if (x →_R y or y →_R x).

The relation ↔*_R is a congruence relation with respect to the concatenation of
words; it is called the Thue-congruence associated with R. Hence we can define
the quotient monoid Σ*/↔*_R, which is briefly denoted by Σ*/R. We say that
R is terminating if there does not exist an infinite sequence of words s_i ∈ Σ*
(i ∈ N) with s_0 →_R s_1 →_R s_2 →_R ···. The set of irreducible words with respect
to R is IRR(R) = Σ*\{stu ∈ Σ* | s,u ∈ Σ*, t ∈ dom(R)}. A word t is a normal
form of s if s →*_R t ∈ IRR(R). We say that R is confluent if for all s,t,u ∈ Σ*
with s →*_R t and s →*_R u there exists w ∈ Σ* with t →*_R w and u →*_R w. We
say that R is locally confluent if for all s,t,u ∈ Σ* with s →_R t and s →_R u
there exists w ∈ Σ* with t →*_R w and u →*_R w. If R is terminating then by
Newman's lemma [New43] R is confluent if and only if R is locally confluent.
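The definitions above can be made concrete with a small sketch. The following Python fragment is only an illustration under our own assumptions: words are modelled as Python strings, an STS as a list of pairs with non-empty left-hand sides, and the names one_step, is_irreducible and normal_form are hypothetical helpers that do not occur in the paper. The function normal_form follows one arbitrary derivation and therefore halts only if the system is terminating.

```python
from typing import List, Tuple

Rule = Tuple[str, str]  # (left-hand side, right-hand side); hypothetical encoding


def one_step(word: str, rules: List[Rule]) -> List[str]:
    """Return all y with word ->_R y (one rule applied at one position)."""
    successors = []
    for lhs, rhs in rules:
        start = word.find(lhs)
        while start != -1:
            successors.append(word[:start] + rhs + word[start + len(lhs):])
            start = word.find(lhs, start + 1)
    return successors


def is_irreducible(word: str, rules: List[Rule]) -> bool:
    """A word is in IRR(R) if no left-hand side occurs in it."""
    return not one_step(word, rules)


def normal_form(word: str, rules: List[Rule]) -> str:
    """Follow one arbitrary derivation until an irreducible word is reached."""
    while True:
        successors = one_step(word, rules)
        if not successors:
            return word
        word = successors[0]
```

For a terminating and confluent system the result of normal_form does not depend on which successor is chosen in each step.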

Two decision problems that are of fundamental importance in the theory of 
semi-Thue systems are the (uniform) word problem and the confluence problem. 
Let C be a class of STSs. The uniform word problem, briefly UWP, for the class
C is the following decision problem:

INPUT: An STS R ∈ C (over some alphabet Σ) and two words u,v ∈ Σ*.
QUESTION: Does u ↔*_R v hold?

The confluence problem, briefly CP, for the class C is the following decision
problem:

INPUT: An STS R ∈ C.
QUESTION: Is R confluent?

The UWP for a singleton class {R} is called the word problem, briefly WP, for
the STS R.

For the class of all terminating STSs the CP is known to be decidable [NB72].
This classical result is based on the so called critical pairs of an STS, which result
from overlapping left-hand sides. A pair (s_1, s_2) ∈ Σ* × Σ* is a critical pair of R
if there exist rules (t_1, u_1), (t_2, u_2) ∈ R such that one of the following two cases
holds:

- t_1 = v t_2 w, s_1 = u_1, and s_2 = v u_2 w for some v,w ∈ Σ* (here the word
t_1 = v t_2 w is an overlapping of t_1 and t_2).

- t_1 = vt, t_2 = tw, s_1 = u_1 w, and s_2 = v u_2 for some t,v,w ∈ Σ* with t ≠ ε
(here the word vtw is an overlapping of t_1 and t_2).

Note that there are only finitely many critical pairs of R. In order to check
whether a terminating STS R is confluent it suffices to calculate for every critical
pair (s,t) of R arbitrary normal forms of s and t. If for some critical pair these
normal forms are not identical then R is not confluent, otherwise R is confluent.

Similarly for the class of all terminating and confluent STSs the UWP is
decidable [KB67]: In order to check whether s ↔*_R t holds for given words
s,t ∈ Σ* we compute arbitrary normal forms of s and t. Then s ↔*_R t if and
only if these normal forms are the same.
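The two decision procedures just described can be sketched as follows. This is a minimal illustration, assuming a terminating STS given as a list of string pairs with non-empty left-hand sides; the function names are our own and the code makes no attempt to meet the complexity bounds discussed later.

```python
from typing import List, Set, Tuple

Rule = Tuple[str, str]  # hypothetical encoding of a rule (s, t)


def normal_form(word: str, rules: List[Rule]) -> str:
    """Apply rules until none matches; halts only for terminating systems."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            pos = word.find(lhs)
            if pos != -1:
                word = word[:pos] + rhs + word[pos + len(lhs):]
                changed = True
                break
    return word


def critical_pairs(rules: List[Rule]) -> Set[Tuple[str, str]]:
    """Collect the critical pairs arising from both kinds of overlap."""
    pairs = set()
    for t1, u1 in rules:
        for t2, u2 in rules:
            # Case 1: t1 = v t2 w, giving the pair (u1, v u2 w).
            start = t1.find(t2)
            while start != -1:
                v, w = t1[:start], t1[start + len(t2):]
                pairs.add((u1, v + u2 + w))
                start = t1.find(t2, start + 1)
            # Case 2: t1 = v t and t2 = t w with t non-empty,
            # giving the pair (u1 w, v u2).
            for k in range(1, min(len(t1), len(t2)) + 1):
                if t1[-k:] == t2[:k]:
                    v, w = t1[:-k], t2[k:]
                    pairs.add((u1 + w, v + u2))
    return pairs


def is_confluent(rules: List[Rule]) -> bool:
    """Confluence test for a terminating STS via its critical pairs."""
    return all(normal_form(s, rules) == normal_form(t, rules)
               for s, t in critical_pairs(rules))


def equivalent(u: str, v: str, rules: List[Rule]) -> bool:
    """Word problem for a terminating and confluent STS."""
    return normal_form(u, rules) == normal_form(v, rules)
```

For instance, the system {ab → a, ba → b} has the critical pair (aa, ab) stemming from the overlapping aba, and the normal forms aa and a differ, so this system is not confluent.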

In this paper we consider the following classes of terminating systems. An
STS R is length-reducing if |s| > |t| for all (s,t) ∈ R. An STS R is weight-reducing
if there exists a weight-function f such that f(s) > f(t) for all (s,t) ∈ R. An
STS R is length-lexicographic if there exists a linear order ≻ on the alphabet
Σ such that for all (s,t) ∈ R it holds |s| > |t| or (|s| = |t| and there exist
u,v,w ∈ Σ* and a,b ∈ Σ with s = uav, t = ubw, and a ≻ b). An STS R
is weight-lexicographic if there exist a linear order ≻ on the alphabet Σ and a
weight-function f such that for all (s,t) ∈ R it holds f(s) > f(t) or (f(s) = f(t)
and there exist u,v,w ∈ Σ* and a,b ∈ Σ with s = uav, t = ubw, and a ≻ b).
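The four conditions can be checked mechanically once a candidate weight-function and a candidate linear order are fixed. The sketch below assumes exactly that: the weight-function is passed as a dictionary f, the order as a dictionary rank (larger rank means larger symbol), and all function names are hypothetical. It does not search for a suitable f or order, which is what the existential quantifiers in the definitions ask for.

```python
from typing import Dict, List, Optional, Tuple

Rule = Tuple[str, str]


def weight(word: str, f: Optional[Dict[str, int]] = None) -> int:
    """f(word); with f omitted this is the length-function."""
    return len(word) if f is None else sum(f[c] for c in word)


def lex_greater(s: str, t: str, rank: Dict[str, int]) -> bool:
    """True if s = u a v and t = u b w with a greater than b in the order."""
    for a, b in zip(s, t):
        if a != b:
            return rank[a] > rank[b]
    return False  # one word is a prefix of the other: no such a, b exist


def is_weight_reducing(rules: List[Rule],
                       f: Optional[Dict[str, int]] = None) -> bool:
    return all(weight(s, f) > weight(t, f) for s, t in rules)


def is_length_reducing(rules: List[Rule]) -> bool:
    return is_weight_reducing(rules, None)


def is_weight_lexicographic(rules: List[Rule], rank: Dict[str, int],
                            f: Optional[Dict[str, int]] = None) -> bool:
    return all(
        weight(s, f) > weight(t, f)
        or (weight(s, f) == weight(t, f) and lex_greater(s, t, rank))
        for s, t in rules
    )


def is_length_lexicographic(rules: List[Rule], rank: Dict[str, int]) -> bool:
    return is_weight_lexicographic(rules, rank, None)
```

With the order a ≻ b the system {ab → ba} passes the length-lexicographic test but not the length-reducing one.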

For all these classes, restricted to confluent STSs, the UWP is decidable.
Since we want to determine the complexity of the UWP for these classes, we
have to define the length of an STS. For an STS R it is natural to define its
length ||R|| by ||R|| = Σ_{(s,t) ∈ R} |st|.

3 Length-Reducing Semi-Thue Systems 

In [Loh99] it was shown that the UWP for the class of all length-reducing and 
confluent STSs over {a, b} is P-complete. In this section we prove that for a 
fixed STS the complexity decreases to LOGCFL. Recall that LOGCFL is the 
class of all problems that are log space reducible to the membership problem for 
a context-free language [Sud78]. It is strongly conjectured that LOGCFL is a
proper subset of P.

Theorem 1. Let R be a fixed length-reducing and confluent STS. Then the WP
for R is in LOGCFL.

Proof. Let R be a fixed length-reducing STS over Σ and let s ∈ Σ*. From
the results of [DW86]¹ it follows immediately that the following problem is in
LOGCFL:

INPUT: A word t ∈ Σ*.
QUESTION: Does t →*_R s hold?

Now let R be a fixed length-reducing and confluent STS over Σ and let u,v ∈ Σ*.
Let Σ̄ = {ā | a ∈ Σ} be a disjoint copy of Σ. For a word s ∈ Σ* define the word
s̄^rev ∈ Σ̄* inductively: for s = ε the word s̄^rev is ε, and for s = at with a ∈ Σ
and t ∈ Σ* the word s̄^rev is t̄^rev ā. Define the length-reducing STS P by

P = R ∪ {s̄^rev → t̄^rev | (s,t) ∈ R} ∪ {aā → ε | a ∈ Σ}.

Since R is confluent, it holds u ↔*_R v if and only if u v̄^rev →*_P ε. The latter
property can be checked in LOGCFL. Clearly u v̄^rev can be constructed in log
space from u and v. □

¹ The main result of [DW86] is that the membership problem for a fixed growing
context-sensitive grammar is in LOGCFL. Note that the uniform variant of this
problem is NP-complete [KN87, CH90, BL92].
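The construction in this proof can be tried out on small examples. The sketch below is an illustration only and rests on our own encoding assumptions: Σ consists of lowercase letters, the barred copy of a letter is its uppercase form, and the reachability test is a plain breadth-first search, which is finite because P is length-reducing but is in no way a LOGCFL algorithm. The names bar_rev, build_p and reduces_to_empty are hypothetical.

```python
from collections import deque
from typing import List, Tuple

Rule = Tuple[str, str]


def bar_rev(word: str) -> str:
    """Reverse the word and replace every letter by its barred copy."""
    return word[::-1].upper()


def build_p(rules: List[Rule], alphabet: str) -> List[Rule]:
    """P = R, plus the mirrored barred rules, plus the rules a a-bar -> empty."""
    mirrored = [(bar_rev(s), bar_rev(t)) for s, t in rules]
    cancel = [(a + a.upper(), "") for a in alphabet]
    return list(rules) + mirrored + cancel


def reduces_to_empty(word: str, p_rules: List[Rule]) -> bool:
    """Breadth-first search for a derivation word ->*_P empty word."""
    seen = {word}
    queue = deque([word])
    while queue:
        current = queue.popleft()
        if current == "":
            return True
        for lhs, rhs in p_rules:
            start = current.find(lhs)
            while start != -1:
                nxt = current[:start] + rhs + current[start + len(lhs):]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                start = current.find(lhs, start + 1)
    return False


def equivalent(u: str, v: str, rules: List[Rule], alphabet: str) -> bool:
    """Decide the word problem for a length-reducing confluent R via P."""
    return reduces_to_empty(u + bar_rev(v), build_p(rules, alphabet))
```

For R = {ab → ε} the words ba and baab are equivalent, and indeed the word baBAAB built from them reduces to the empty word.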



4 Weight-Reducing Semi-Thue Systems 



Weight-reducing STSs were investigated for instance in [Die87, Jan88, NO88]
and [BL92] as a grammatical formalism. The WP for a fixed weight-reducing
and confluent STS can be easily reduced to the WP for a fixed length-reducing
and confluent STS. Thus the WP for every fixed weight-reducing and confluent
STS can also be solved in LOGCFL:

Theorem 2. Let R be a fixed weight-reducing and confluent STS. Then the WP
for R is in LOGCFL.

Proof. Let R be a weight-reducing and confluent STS over Σ and let u,v ∈ Σ*.
Let f be a weight-function such that f(s) > f(t) for all (s,t) ∈ R. Let $ ∉ Σ and
define the morphism φ : Σ* → (Σ ∪ {$})* by φ(a) = a$^{f(a)-1} for all a ∈ Σ.

Note that non-trivial overlappings between two words φ(a) and φ(b) are not
possible. It follows that the STS φ(R) = {φ(s) → φ(t) | (s,t) ∈ R} is length-reducing
and confluent, and we see that u ↔*_R v if and only if φ(u) ↔*_{φ(R)} φ(v).
Since φ(u) and φ(v) can be constructed in log space, the theorem follows from
Theorem 1. □
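The idea of this reduction is to pad every symbol so that the length of its image equals its weight. The following sketch shows one such padding; the concrete shape of the code word (the symbol followed by f(a) - 1 copies of a fresh marker) is our assumption for the illustration, and the names pad and pad_rules are hypothetical.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, str]


def pad(word: str, f: Dict[str, int], marker: str = "$") -> str:
    """Replace every symbol a by a code word of length f(a)."""
    return "".join(a + marker * (f[a] - 1) for a in word)


def pad_rules(rules: List[Rule], f: Dict[str, int]) -> List[Rule]:
    """Image of the STS under the padding morphism."""
    return [(pad(s, f), pad(t, f)) for s, t in rules]


def weight(word: str, f: Dict[str, int]) -> int:
    return sum(f[a] for a in word)
```

Since the length of pad(s, f) equals the weight of s, a rule that reduces the weight becomes a rule that reduces the length. For example, with f(a) = f(b) = 2 and f(c) = 1 the rule ab → cc becomes a$b$ → cc.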



Next we will consider the UWP for weight-reducing and confluent STSs over
a fixed alphabet Σ. In order to get an upper bound for this problem we need
the following lemma, which we state in a slightly more general form for later
applications.

Lemma 1. Let Σ be a finite alphabet with |Σ| = n and let R be an STS over
Σ with α = max{|s|_a | s ∈ dom(R) ∪ ran(R), a ∈ Σ}. Let g be a weight-function
with g(s) ≥ g(t) for all (s,t) ∈ R. Then there exists a weight-function f such
that for all (s,t) ∈ R the following holds:

- If g(s) > g(t) then f(s) > f(t), and if g(s) = g(t) then f(s) = f(t).

- f(a) ≤ (n+1)(αn)^n for all a ∈ Σ.



Proof. We use the following result about solutions of integer (in)equalities from
[vZGS78]: Let A, B, C, D be (m × n)-, (m × 1)-, (p × n)-, (p × 1)-matrices,
respectively, with integer entries. Let r = rank(A), s = rank $\begin{pmatrix} A \\ C \end{pmatrix}$. Let M be
an upper bound on the absolute values of all (s-1) × (s-1)- or (s × s)-subdeterminants
of the (m+p) × (n+1)-matrix $\begin{pmatrix} A & B \\ C & D \end{pmatrix}$, which are formed with
at least r rows from the matrix (A B). Then the system Ax = B, Cx ≥ D
has an integer solution if and only if it has an integer solution x such that the
absolute value of every entry of x is bounded by (n+1)M.

Now let Σ, n, R, α, and g be as specified in the lemma. Let Σ = {a_1, ..., a_n}
and R = {(s_i, t_i) | 1 ≤ i ≤ k} ∪ {(u_i, v_i) | 1 ≤ i ≤ ℓ}, where g(s_i) = g(t_i) for
1 ≤ i ≤ k and g(u_i) > g(v_i) for 1 ≤ i ≤ ℓ. Define the (k × n)-matrix A by
A_{i,j} = |s_i|_{a_j} - |t_i|_{a_j} and define the (ℓ × n)-matrix C' by C'_{i,j} = |u_i|_{a_j} - |v_i|_{a_j}.
Let C = $\begin{pmatrix} C' \\ \mathrm{Id}_n \end{pmatrix}$, where Id_n is the (n × n)-identity matrix. Finally let (j)_i be the
i-dimensional column vector with all entries equal to j. Then the n-dimensional
column vector x with x_j = g(a_j) is a solution of the following system:

Ax = (0)_k,   Cx ≥ (1)_{ℓ+n}                                        (1)

Note that r = rank(A) ≤ n and s = rank $\begin{pmatrix} A \\ C \end{pmatrix}$ ≤ n. Furthermore every entry of
the matrix E = $\begin{pmatrix} A & (0)_k \\ C & (1)_{\ell+n} \end{pmatrix}$ is bounded by α. Thus the absolute value of every
(s-1) × (s-1)- or (s × s)-subdeterminant of E is bounded by s! · α^s ≤ (αn)^n.
By the result of [vZGS78] the system (1) has a solution y with y_j ≤ (n+1)(αn)^n
for all 1 ≤ j ≤ n. If we define the weight-function f by f(a_j) = y_j then f has
the properties stated in the lemma. □



Theorem 3. Let Σ be a fixed alphabet with |Σ| ≥ 2. Then the UWP for the
class of all weight-reducing and confluent STSs over Σ is P-complete.

Proof. Let |Σ| = n ≥ 2. Let R be a weight-reducing and confluent STS over
Σ and let u,v ∈ Σ*. By Lemma 1 there exists a weight-function f such that
f(s) > f(t) for all (s,t) ∈ R and f(a) ≤ (n+1)(αn)^n for all a ∈ Σ. Thus every
derivation that starts from the word u has a length bounded by |u| · (n+1) · (αn)^n,
which is polynomial in the input length ||R|| + |uv|. Thus a normal form of u
can be calculated in polynomial time and similarly for v. This proves the upper
bound. P-hardness follows from the fact that the UWP for the class of all length-
reducing and confluent STSs over {a,b} is P-complete [Loh99]. □
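The bound used in this proof is easy to evaluate for a concrete system. The following sketch computes α, the bound (n+1)(αn)^n on the weight of a single symbol, and the resulting bound on the length of a derivation that starts from a word u; the function names are hypothetical and the code is only meant to make the arithmetic explicit.

```python
from typing import List, Tuple

Rule = Tuple[str, str]


def alpha(rules: List[Rule]) -> int:
    """Maximal number of occurrences of one symbol in one side of a rule."""
    return max((side.count(a)
                for rule in rules for side in rule for a in set(side)),
               default=0)


def weight_bound(n: int, a: int) -> int:
    """The bound (n+1)(a*n)^n of Lemma 1 on the weight of a symbol."""
    return (n + 1) * (a * n) ** n


def derivation_bound(u: str, rules: List[Rule], n: int) -> int:
    """Upper bound |u| * (n+1) * (alpha*n)^n on the derivation length."""
    return len(u) * weight_bound(n, alpha(rules))
```

For the single rule ab → cc over a three-letter alphabet we get α = 2 and the symbol bound 4 · 6^3 = 864. For a fixed n the exponent is a constant, which is why the bound is polynomial in α and |u|; once n is part of the input the same expression becomes exponential.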



Finally for the class of all weight-reducing and confluent STSs the complexity of
the UWP increases to EXPTIME:

Theorem 4. The UWP for the class of all weight-reducing and confluent STSs
is EXPTIME-complete.

Proof. The EXPTIME-upper bound can be shown by using the arguments from
the previous proof. Just note that this time the upper bound of (n+1)(αn)^n for
a weight-function is exponential in the length of the input. For the lower bound
let M = (Q, Σ, δ, q_0, q_f) be a deterministic Turing-machine such that for some
polynomial p it holds: If w ∈ L(M) then M, started on w, reaches the final state
q_f after at most 2^{p(|w|)} many steps. Let w ∈ (Σ\{□})* be an arbitrary input for
M. Let m = p(|w|) and let

Γ = Q ∪ Σ ∪ ⋃_{i=0}^{m} (Σ_i ∪ {2^(i), A_i, B_i}) ∪ {#, ▷}.

Here 2^(i) is a single symbol and Σ_i = {a_i | a ∈ Σ} is a disjoint copy of Σ for
0 ≤ i ≤ m. Let Σ_▷ = Σ ∪ {▷} and let R be the STS over Γ that consists of the
following rules:

(1)   2^(i) a b → a_i 2^(i-1) ··· 2^(1) 2^(0) b   for 0 ≤ i ≤ m, a ∈ Σ_▷, b ∈ Σ
(2)   2^(i) a_k → a_{k-1} 2^(i)                   for 0 ≤ i ≤ m, 1 ≤ k ≤ m, a ∈ Σ_▷
(3)   # a_k → a #                                 for 0 ≤ k ≤ m, a ∈ Σ_▷
(4)   # x → x                                     for x ∈ Σ ∪ Q\{q_f}
(5)   2^(i) q → q                                 for q ∈ Q\{q_f}
(6)   2^(i) c q a → c b p                         for 0 ≤ i ≤ m, c ∈ Σ_▷, δ(q,a) = (p,b,+1)
(7)   2^(i) c q a → p c b                         for 0 ≤ i ≤ m, c ∈ Σ, δ(q,a) = (p,b,-1)
(8)   A_i → A_{i+1} A_{i+1}                       for 0 ≤ i < m
(9)   B_i → B_{i+1} B_{i+1}                       for 0 ≤ i < m
(10)  A_m → # 2^(m)
(11)  B_m → □
(12)  x q_f → q_f                                 for x ∈ Γ
(13)  q_f x → q_f                                 for x ∈ Γ


We claim that R is weight-reducing. For this we define the weight-function f as
follows:²

f(x) = 1                        for x ∈ Q ∪ Σ_▷ ∪ {#}
f(a_i) = 1 + (i+1)/(m+2)        for 0 ≤ i ≤ m, a ∈ Σ_▷
f(A_m) = 2^m + 2
f(B_m) = 2
f(A_i) = 2 · f(A_{i+1}) + 1     for 0 ≤ i < m
f(B_i) = 2 · f(B_{i+1}) + 1     for 0 ≤ i < m
f(2^(i)) = 2^i                  for 0 ≤ i ≤ m

² Here we use rational weights, but of course they can be replaced by integer weights.

Then it is easy to check that f(s) > f(t) for all (s,t) ∈ R. All non-trivial critical
pairs of R are of the form (s q_f, t q_f) (where (sx, t) ∈ R, x ∈ Σ), (q_f s, q_f t) (where
(xs, t) ∈ R, x ∈ Σ), or (x q_f, q_f y) (where x,y ∈ Σ). By the rules in (12) and
(13) both components of these critical pairs can be reduced to q_f. Thus R is
confluent. Finally we claim that A_0 ▷ q_0 w B_0 →*_R q_f if and only if w ∈ L(M).

Before we prove this claim let us first explain the effect of the rules from
R. For 0 ≤ i ≤ 2^m let sum(i) = 2^(i_1) ··· 2^(i_k) ∈ Γ* if i_1 > ··· > i_k and
i = 2^{i_1} + ··· + 2^{i_k} (note that sum(0) = ε). Let us call a word of the form
#sum(i) ∈ Γ* a counter with value i. The effect of the rules in (1), (2), and
(3) is to move counters to the right in words from ▷Σ*. Here the symbol ▷ is
a left-end marker. If a whole counter moves one step to the right, its value is
decreased by one. More generally for all u ∈ Σ*, b ∈ Σ, and all |u| < i ≤ 2^m we
have #sum(i)▷ub →*_R ▷u#sum(i-|u|-1)b. If a counter has reached the value 0,
i.e., it consists only of the symbol #, then the counter is deleted with a rule in (4).
Also if a counter collides with a state symbol from Q at its right end, then the
counter is deleted with the rules in (4) and (5). Note that such a collision may
occur after an application of a rule in (7). The rules in (6) and (7) simulate the
machine M. In order to be weight-reducing, these rules consume the right-most
symbol of the right-most counter. The rules in (8) and (10) produce 2^m many
counters of the form #2^(m). Each of these counters can move at most 2^m cells
to the right. But since M terminates after at most 2^m many steps, the distance
between the left end of the tape and the read-write head is also at most 2^m.
This implies that with each of the 2^m many counters that are produced from
A_0, at least one step of M can be simulated. The rules in (9) and (11) produce
2^m many blank symbols, which is enough in order to simulate 2^m many steps of
M. Finally the rules in (12) and (13) make the final state q_f absorbing.
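The counter arithmetic behind the words sum(i) can be illustrated as follows. In this sketch a counter is modelled only by the list of exponents of its symbols 2^(i), written in decreasing order; the symbols #, ▷ and the tape contents are left out. The function step mimics what happens to the exponents when the right-most counter symbol is rewritten as in rule (1), namely one symbol 2^(e) is replaced by 2^(e-1) ··· 2^(0). The names sum_word, value and step are hypothetical.

```python
from typing import List


def sum_word(i: int) -> List[int]:
    """Exponents i_1 > ... > i_k with i = 2^i_1 + ... + 2^i_k."""
    return [k for k in range(i.bit_length() - 1, -1, -1) if (i >> k) & 1]


def value(exponents: List[int]) -> int:
    """Weight of the counter symbols, using f(2^(e)) = 2^e."""
    return sum(2 ** e for e in exponents)


def step(exponents: List[int]) -> List[int]:
    """Replace the right-most symbol 2^(e) by 2^(e-1) ... 2^(0)."""
    *rest, e = exponents
    return rest + list(range(e - 1, -1, -1))
```

Since 2^e - 1 = 2^(e-1) + ··· + 2^0, one such step turns sum(i) into sum(i-1), which matches the statement that the value of a counter decreases by one for every cell it passes.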

Now if w ∈ L(M) then A_0 ▷ q_0 w B_0 →*_R u q_f v →*_R q_f
for some u,v ∈ Γ*. On the other hand if w ∉ L(M) then M does not terminate
on w. By simulating M long enough, and thereby consuming all 2^m many initial
counters, we obtain A_0 ▷ q_0 w B_0 →*_R ▷uqv ∈ IRR(R)
for some q ∈ Q\{q_f}, u,v ∈ Σ*. Since also q_f ∈ IRR(R) and R is confluent,
A_0 ▷ q_0 w B_0 →*_R q_f cannot hold. □

Since P is a proper subclass of EXPTIME, it follows from Theorem 3 and The- 
orem 4 that in general it is not possible to encode the alphabet of a weight- 
reducing and confluent STS into a fixed alphabet with a polynomial blow-up 
such that the resulting STS is still weight-reducing and confluent. For length- 
reducing systems this is always possible, see [Loh99] and the coding function 
from the proof of Theorem 5. 

5 Length-Lexicographic Semi-Thue Systems 

In this section we consider length-lexicographic semi-Thue systems, see for in- 
stance [KN85]. The complexity bounds that we will achieve in this section are 
the same that are known for preperfect systems. An STS R is preperfect if for
all s,t ∈ Σ* it holds s ↔*_R t if and only if there exists u ∈ Σ* with s ↦*_R u and
t ↦*_R u, where the relation ↦_R is defined by v ↦_R w if v ↔_R w and |v| ≥ |w|.
Since every length-preserving STS is preperfect and every linear bounded au- 
tomaton can easily be simulated by a length-preserving STS, there exists a fixed 
preperfect STS TZ such that the WP for TZ is PSPACE-complete [BJM+81]. The 
following theorem may be seen as a stronger version of this well-known fact in 
the sense that a deterministic linear bounded automaton can even be simulated 
by a length-lexicographic, length-preserving, and confluent STS. 

Theorem 5. The WP for a length-lexicographic and confluent STS is contained
in PSPACE. Furthermore there exists a fixed length-lexicographic and confluent
STS R over {a, b} such that the WP for R is PSPACE-complete.

Proof. The first statement of the theorem is obvious. For the second statement
let M = (Q, Σ, δ, q_0, q_f) be a deterministic linear bounded automaton such that
the question whether w ∈ L(M) is PSPACE-complete. Such a linear bounded
automaton exists, see e.g. [BO84]. We may assume that M operates in phases,
where a single phase consists of a sequence of 2 · n transitions of the form
q_1 w_1 ⇒*_M w_2 q_2 ⇒*_M q_3 w_3, where w_1, w_2, w_3 ∈ Σ* and q_1, q_2, q_3 ∈ Q. During the
sequence q_1 w_1 ⇒*_M w_2 q_2 only right-moves are made, and during the sequence
w_2 q_2 ⇒*_M q_3 w_3 only left-moves are made. A similar trick is used for instance
also in [CH90]. Let c > 0 be a constant such that if w ∈ L(M) then M, started on
w, reaches the final state q_f after at most 2^{c·n} phases. Let w ∈ (Σ\{□})* be an
input for M with |w| = n. As usual let Σ̄ be a disjoint copy of Σ and similarly
for Q. Let Γ = Q ∪ Q̄ ∪ Σ ∪ Σ̄ ∪ {◁, 0, 1, 1̄} and let R be the STS over Γ that
consists of the following rules:³

0q → q1̄        for all q ∈ Q
1q → 0q̄        for all q ∈ Q
q̄1̄ → 1q̄        for all q ∈ Q
q̄a → b̄p̄        if δ(q,a) = (p,b,+1)
q̄◁ → q◁        for all q ∈ Q\{q_f}
āq → pb        if δ(q,a) = (p,b,-1)
x q_f → q_f    for all x ∈ Γ
q_f x → q_f    for all x ∈ Γ

First we claim that R is length-lexicographic. For this choose a linear order ≻
on the alphabet Γ that satisfies Q̄ ≻ 1 ≻ 0 ≻ Σ̄ ≻ Q (here for instance Q̄ ≻ 1
means that q̄ ≻ 1 for every q ∈ Q). Furthermore R is confluent. Finally we
claim that 10^{cn} q_0 w◁ →*_R q_f if and only if w ∈ L(M). For v = b_k ··· b_0 ∈ {0,1}*
(b_i ∈ {0,1}) let val(v) = Σ_{i=0}^{k} b_i · 2^i. Note that for every q ∈ Q and s,t ∈ {0,1}+
with s ≠ 0^{|s|} it holds sq →*_R tq̄ if and only if |s| = |t| and val(t) = val(s) - 1. First
assume that w ∈ L(M). Then 10^{cn} q_0 w◁ →*_R v q_f u◁ →*_R q_f for some u ∈ Σ*
and v ∈ {0,1}+. Now assume that w ∉ L(M). Then M does not terminate on w
and we obtain 10^{cn} q_0 w◁ →*_R q 1̄^{cn+1} u◁ ∈ IRR(R), where u ∈ Σ*
and q ∈ Q\{q_f}. Since also q_f ∈ IRR(R) and R is confluent, 10^{cn} q_0 w◁ →*_R q_f
cannot hold.

Finally, we have to encode the alphabet Γ into the alphabet {a, b}. For this
let Γ = {a_1, ..., a_k} and let a_1 ≻ a_2 ≻ ··· ≻ a_k be the chosen linear order
on Γ. Define a morphism φ : Γ* → {a, b}* by φ(a_i) = and let
a ≻ b. Then the STS φ(R) is also length-lexicographic and confluent and for all
u,v ∈ Γ* it holds u ↔*_R v if and only if φ(u) ↔*_{φ(R)} φ(v), see [BO93], p 60. □
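The binary counter that limits the number of simulated phases can be observed in isolation. The sketch below assumes that the three counter rules have the shapes 0q → q1̄, 1q → 0q̄ and q̄1̄ → 1q̄, and it encodes a single state q as the character q, its barred copy as Q, and the marked symbol 1̄ as I. This encoding and the helper name normal_form are our own; the rules that simulate the automaton itself are omitted.

```python
from typing import List, Tuple

Rule = Tuple[str, str]

# Counter rules for one state symbol (hypothetical ASCII encoding):
#   "0q" -> "qI"   borrow: the state moves left and marks the digit
#   "1q" -> "0Q"   the first 1 becomes 0, the state turns around
#   "QI" -> "1Q"   the barred state moves right and unmarks the digits
COUNTER_RULES: List[Rule] = [("0q", "qI"), ("1q", "0Q"), ("QI", "1Q")]


def normal_form(word: str, rules: List[Rule]) -> str:
    """Apply rules until none matches; these rules are terminating."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            pos = word.find(lhs)
            if pos != -1:
                word = word[:pos] + rhs + word[pos + len(lhs):]
                changed = True
                break
    return word
```

Starting from 100q the rules produce 10qI, 1qII, 0QII, 01QI and finally 011Q, so the value 4 has been decremented to 3 and the state is ready to continue to the right. Starting from 000q the word qIII is reached, where the state is stuck at the left end; this corresponds to the case in which the counter is exhausted.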

6 Weight-Lexicographic Semi-Thue Systems 

The widest class of STSs that we study in this paper are weight-lexicographic
STSs. Let R be a weight-lexicographic STS over an alphabet Σ with |Σ| = n
and let u ∈ Σ*. Thus there exists a weight-function f with f(s) ≥ f(t) for all
(s,t) ∈ R. If u = u_0 →_R u_1 →_R ··· →_R u_n is some derivation then for all
0 ≤ i ≤ n it holds |u_i| ≤ f(u_i) ≤ f(u). By Lemma 1 we may assume that
f(a) ≤ (n+1)(αn)^n for all a ∈ Σ and thus |u_i| ≤ |u| · (n+1)(αn)^n. Together
with Theorem 5 it follows that the UWP for weight-lexicographic and confluent
STSs over a fixed alphabet is PSPACE-complete and furthermore that there
exists a fixed weight-lexicographic and confluent STS whose WP is PSPACE-complete.
For arbitrary weight-lexicographic and confluent STSs we have the
following result.

³ It will be always clear from the context whether e.g. 1 denotes the symbol 1 ∈ Γ or
the natural number 1.

Theorem 6. The UWP for the class of all weight-lexicographic and confluent
STSs is EXPSPACE-complete.

Proof. The EXPSPACE-upper bound can be shown by using the arguments
above. For the lower bound let M = (Q, Σ, δ, q_0, q_f) be a deterministic Turing-
machine which uses for every input w at most space 2^{p(|w|)}, where p is some
polynomial. Similarly to the proof of Theorem 5 we may assume that M operates
in phases. There exists a polynomial q such that if w ∈ L(M) then M, started
on w, reaches q_f after at most 2^{2^{q(|w|)}} many phases. Let w ∈ (Σ\{□})* be an
arbitrary input for M. Let m = p(|w|), n = q(|w|), and

Γ = Q ∪ Q̄ ∪ Σ ∪ Σ̄ ∪ {◁, 0, 1, 1̄} ∪ {A_i | 0 ≤ i ≤ n} ∪ {B_i | 0 ≤ i ≤ m}.

Let R be the STS over Γ that consists of the following rules:

0q → q1̄                  for all q ∈ Q
1q → 0q̄                  for all q ∈ Q
q̄1̄ → 1q̄                  for all q ∈ Q
q̄a → b̄p̄                  if δ(q,a) = (p,b,+1)
q̄◁ → q◁                  for all q ∈ Q\{q_f}
āq → pb                  if δ(q,a) = (p,b,-1)
A_i → A_{i+1} A_{i+1}    for 0 ≤ i < n
B_i → B_{i+1} B_{i+1}    for 0 ≤ i < m
A_n → 0
B_m → □
x q_f → q_f              for all x ∈ Γ
q_f x → q_f              for all x ∈ Γ

Note that the first six rules are exactly the same rules that we used for the
simulation of a linear bounded automaton in the proof of Theorem 5. We claim
that R is weight-lexicographic. For this define the weight-function f by f(x) =
1 for all x ∈ Q ∪ Q̄ ∪ Σ ∪ Σ̄ ∪ {◁, 0, 1, 1̄, A_n, B_m} and f(A_i) = 2 · f(A_{i+1}),
f(B_j) = 2 · f(B_{j+1}) for 0 ≤ i < n, 0 ≤ j < m. Then the last two rules are
weight-reducing and all other rules are weight-preserving. Now choose a linear
order ≻ on Γ that satisfies Q̄ ≻ 1 ≻ 0 ≻ Σ̄ ≻ Q, A_0 ≻ A_1 ≻ ··· ≻ A_n ≻ 0, and
B_0 ≻ B_1 ≻ ··· ≻ B_m ≻ □. It is easy to see that R is confluent. Finally we have
w ∈ L(M) if and only if 1A_0 q_0 w B_0 ◁ →*_R q_f. This can be shown by using the
arguments from the proof of Theorem 5. Just note that this time from the word
1A_0 we can generate the word 10^{2^n}, which allows the simulation of 2^{2^n} many
phases. Analogously to the proof of Theorem 4 the symbol B_0 generates enough
blank symbols in order to satisfy the space requirements of M. □
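The doubling rules are what allows a polynomial-size input to unfold into an exponentially long counter or tape. The following sketch counts what the rules A_i → A_{i+1}A_{i+1} (for i < n) and A_n → 0 produce from the single symbol A_0; only the indices of the symbols are tracked, and the function name expand is hypothetical.

```python
from typing import Tuple


def expand(n: int) -> Tuple[int, int]:
    """Rewrite A_0 completely; return (number of 0s, number of rule steps)."""
    pending = [0]   # indices i of the symbols A_i that still have to be rewritten
    zeros = 0
    steps = 0
    while pending:
        i = pending.pop()
        steps += 1
        if i < n:
            pending.extend([i + 1, i + 1])   # A_i -> A_{i+1} A_{i+1}
        else:
            zeros += 1                       # A_n -> 0
    return zeros, steps
```

For n = 3 this yields 2^3 = 8 symbols 0 after 15 rewrite steps. In general A_0 produces 2^n zeros, which agrees with the weight f(A_0) = 2^n of the weight-function defined in the proof, since the doubling rules are weight-preserving.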

7 Confluence Problems 

The CP for the class of all STSs is undecidable [BO84]. On the other hand, the
CP for the class of all terminating STSs is decidable [NB72]. For length-reducing
STSs the CP is in P [BO81]; the best known algorithm is the O(||R||^3)-algorithm
from [KKMN85]. Furthermore in [Loh99] it was shown that the CP for the class
of all length-reducing STSs is P-complete. This was shown by using the following
log space reduction from the UWP for length-reducing and confluent STSs to the
CP for length-reducing STSs, see also [VRL98], Theorem 24: Let R be a length-
reducing and confluent STS over Σ. Furthermore let A and B be new symbols.
Then for all s,t ∈ Σ* the length-reducing STS R ∪ {A^{|st|}B → s, A^{|st|}B → t} is
confluent if and only if s ↔*_R t holds. Finally the alphabet Σ ∪ {A, B} can be
reduced to the alphabet {a, b} by using the coding function from the end of the
proof of Theorem 5. The same reduction can be also used for weight-reducing,
length-lexicographic, and weight-lexicographic STSs. Thus a lower bound for the
UWP for one of the classes considered in the preceding sections carries over to the
CP for this class. Furthermore also the given upper bounds hold for the CP for
the corresponding class: Our upper bound algorithms for UWPs are all based on
the calculation of normal forms. But since every STS has only polynomially many
critical pairs, any upper bound for the calculation of normal forms also gives an
upper bound for the CP. The resulting complexity results are summarized in
Table 1.
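The reduction from the UWP to the CP can be replayed on small instances. The sketch below assumes that the two added rules are A^{|st|}B → s and A^{|st|}B → t, that Σ consists of lowercase letters so that the characters A and B are fresh, and that R is terminating with non-empty left-hand sides. All function names are hypothetical.

```python
from typing import List, Set, Tuple

Rule = Tuple[str, str]


def normal_form(word: str, rules: List[Rule]) -> str:
    """Apply rules until none matches; halts only for terminating systems."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            pos = word.find(lhs)
            if pos != -1:
                word = word[:pos] + rhs + word[pos + len(lhs):]
                changed = True
                break
    return word


def critical_pairs(rules: List[Rule]) -> Set[Tuple[str, str]]:
    """Critical pairs from factor overlaps and suffix-prefix overlaps."""
    pairs = set()
    for t1, u1 in rules:
        for t2, u2 in rules:
            start = t1.find(t2)
            while start != -1:
                pairs.add((u1, t1[:start] + u2 + t1[start + len(t2):]))
                start = t1.find(t2, start + 1)
            for k in range(1, min(len(t1), len(t2)) + 1):
                if t1[-k:] == t2[:k]:
                    pairs.add((u1 + t2[k:], t1[:-k] + u2))
    return pairs


def is_confluent(rules: List[Rule]) -> bool:
    return all(normal_form(x, rules) == normal_form(y, rules)
               for x, y in critical_pairs(rules))


def reduction(rules: List[Rule], s: str, t: str) -> List[Rule]:
    """Add the two rules with the common left-hand side A^{|st|} B."""
    lhs = "A" * len(s + t) + "B"
    return list(rules) + [(lhs, s), (lhs, t)]
```

For the confluent system {ab → ε, ba → ε} the extended system is confluent for s = abba and t = ε, but not for s = a and t = b, because the new critical pair (a, b) consists of two different irreducible words.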



Table 1. Complexity results for confluence problems

                          length-reducing   weight-reducing    length-lexicographic   weight-lexicographic
                          STSs              STSs               STSs                   STSs
CP for a fixed alphabet   P-complete        P-complete         PSPACE-complete        PSPACE-complete
CP                        P-complete        EXPTIME-complete   PSPACE-complete        EXPSPACE-complete



8 Quasi Context-Sensitive Grammars 

A quasi context-sensitive grammar, briefly QCSG, is a (type-0) grammar G =
(N, T, S, P) (here N is the set of non-terminals, T is the set of terminals, S ∈ N
is the start non-terminal, and P ⊆ (N ∪ T)* N (N ∪ T)* × (N ∪ T)* is a finite set
of productions) such that for some weight-function f : (N ∪ T)* → N we have
f(u) ≤ f(v) for all (u,v) ∈ P, see [BL94]. The variable membership problem for
QCSGs is the following problem:

INPUT: A QCSG G with terminal alphabet T and a terminal word v ∈ T*.
QUESTION: Does v ∈ L(G) hold?

In [BL94] it was shown that this problem is in EXPSPACE and furthermore
that it is NEXPTIME-hard. Using some ideas from Section 4 we can prove that
this problem is in fact EXPSPACE-hard.
Theorem 7. The variable membership problem for QCSGs is EXPSPACE-complete.

Proof. It remains to show that the problem is EXPSPACE-hard. For this let
M = (Q, Σ, δ, q_0, q_f) be a Turing-machine, which uses for every input w at most
space 2^{p(|w|)} - 2 for some polynomial p. Let w ∈ (Σ\{□})* be an input for M and
let m = p(|w|). We will construct a QCSG G = (N, T, S, P) and a word v ∈ T*
such that w ∈ L(M) if and only if v ∈ L(G). The non-terminal and terminal
alphabet of G are N = {S, B} ∪ {A_i | 0 ≤ i < m} ∪ Q ∪ Σ and T = {A_m}. The
set P consists of the following productions:

S → q_0 w B
B → □B
qa → bp              if δ(q,a) = (p,b,+1)
cqa → pcb            if δ(q,a) = (p,b,-1), c ∈ Σ
q_f → A_0
x A_0 → A_0 A_0      for x ∈ Σ
A_0 x → A_0 A_0      for x ∈ Σ ∪ {B}
A_i A_i → A_{i+1}    for 0 ≤ i < m

In order to show that G is quasi context-sensitive we define the weight-function
f : (N ∪ T)* → N by f(x) = 1 for all x ∈ {S, B} ∪ Σ ∪ Q and f(A_i) = 2^i for all
0 ≤ i ≤ m. Then f(s) ≤ f(t) for all (s,t) ∈ P. If w ∈ L(M) then A_m ∈ L(G)
by the following derivation:

S →_P q_0 w B →*_P q_0 w □^{2^m-2-|w|} B →*_P s q_f t B →_P s A_0 t B →*_P A_0^{2^m} →*_P A_m,

where s,t ∈ Σ* and |st| = 2^m - 2. On the other hand, if A_m ∈ L(G) then a
sentential form u q_f v with uv ∈ (Σ ∪ {B})* and |uv| = 2^m - 1 must be reachable from S,
i.e., reachable from q_0 w B. This is only possible if w ∈ L(M). Furthermore G and
v can be calculated from M and w in log space. This concludes the proof. □

Note that from the previous proof it follows immediately that the following 
problem is also EXPSPACE-complete: 

INPUT: A context-sensitive grammar G with terminal alphabet {a} and a num- 
ber n G N coded in binary. 

QUESTION: Does a^n ∈ L(G) hold?

The same problem for context-free grammars is NP-complete [Huy84] . 



9 Summary and Open Problems 

The complexity results for WPs are summarized in Table 2. Here the statement 
in the first row that e.g. the WP for length-lexicographic and confluent STSs is 
PSPACE-complete means that for every length-lexicographic and confluent STS 
the WP is in PSPACE and furthermore there exists a fixed length-lexicographic 
and confluent STS whose WP is PSPACE-complete. Furthermore the complete- 
ness results in the second row already hold for the alphabet {a, b}. 
Table 2. Complexity results for word problems

                           length-reducing &   weight-reducing &   length-lexicographic &   weight-lexicographic &
                           confluent STSs      confluent STSs      confluent STSs           confluent STSs
WP                         LOGCFL              LOGCFL              PSPACE-complete          PSPACE-complete
UWP for a fixed alphabet   P-complete          P-complete          PSPACE-complete          PSPACE-complete
UWP                        P-complete          EXPTIME-complete    PSPACE-complete          EXPSPACE-complete



One open question that remains concerns the WP for a fixed length-reducing
(weight-reducing) and confluent STS. Does there exist such a system whose WP
is LOGCFL-complete or are these WPs always contained for instance in the subclass
LOGDCFL, the class of all languages that are log space reducible to a
deterministic context-free language? Since there exists a fixed deterministic context-
free language whose membership problem is LOGDCFL-complete [Sud78], Theorem
2.2 of [MNO88] implies that there exists a fixed length-reducing and confluent
STS whose WP is LOGDCFL-hard.

Another interesting open problem is the descriptive power of the STSs considered
in this paper. Let M_lr (M_wr, M_ll, M_wl) be the class of all monoids
(modulo isomorphism) of the form Σ*/R, where R is a length-reducing (weight-
reducing, length-lexicographic, weight-lexicographic) and confluent STS over Σ.
In [Die87] it was shown that the monoid {a, b, c}*/{ab → c^2} is not contained
in M_lr. Since the STS {ab → c^2} is of course confluent, weight-reducing, and
length-lexicographic, it follows that M_lr is strictly contained in M_wr and M_ll.
Furthermore the monoid {a, b}*/{ab → ba} is contained in M_ll\M_wr [Die90], p
90. If there exists a monoid in M_wr\M_ll then M_wr and M_ll are incomparable
and both are proper subclasses of M_wl. But we do not know whether this holds.

Finally, another interesting class of rewriting systems, for which (uniform)
word problems and confluence problems were studied, is the class of commutative
semi-Thue systems, see for instance [Gar75, Huy85, Huy86, Loh99, MM82,
VRL98] for several decidability and complexity results. But there are still many
interesting open questions, see for instance the remarks in [Huy85, Huy86, Loh99].
Abstract. In [9], implicit generalizations over some Herbrand universe
H were introduced as constructs of the form I = t/t_1 ∨ ... ∨ t_m with the
intended meaning that I represents all H-ground instances of t that are
not instances of any term t_i on the right-hand side. More generally, we
can also consider disjunctions ℐ = I_1 ∨ ... ∨ I_n of implicit generalizations,
where ℐ contains all ground terms from H that are contained in at
least one of the implicit generalizations I_j. Implicit generalizations have
applications to many areas of Computer Science. For the actual work,
the so-called finite explicit representability problem plays an important
role, i.e.: Given a disjunction of implicit generalizations ℐ = I_1 ∨ ... ∨ I_n,
do there exist terms r_1, ..., r_l, s.t. the ground terms represented by ℐ
coincide with the union of the H-ground instances of the terms r_j? In
this paper, we prove the coNP-completeness of this decision problem.



1 Introduction 

In [9], implicit generalizations over a Herbrand universe $H$ were introduced as constructs of the form $I = t/t_1 \vee \ldots \vee t_m$ with the intended meaning that $I$ represents all $H$-ground instances of $t$ that are not instances of any term $t_i$ on the right-hand side. The usefulness of implicit generalizations mainly comes from their increased expressive power w.r.t. explicit generalizations, which are defined as disjunctions of terms $E = r_1 \vee \ldots \vee r_l$, s.t. $E$ represents all ground terms $s \in H$ that are instances of some $r_j$. In particular, implicit generalizations allow us to finitely represent certain sets of ground terms, which have no finite representation via explicit generalizations. For the actual work with implicit generalizations, two decision problems have to be solved, namely: The emptiness problem (i.e.: Does a given implicit generalization $I = t/t_1 \vee \ldots \vee t_m$ contain no ground term $s \in H$?) and the finite explicit representability problem (i.e.: Given an implicit generalization $I = t/t_1 \vee \ldots \vee t_m$, does there exist an explicit generalization $E = r_1 \vee \ldots \vee r_l$, s.t. $I$ and $E$ represent the same set of ground terms in $H$?). More generally, we can also consider disjunctions $\mathcal{I} = I_1 \vee \ldots \vee I_n$ of implicit generalizations, where $\mathcal{I}$ contains all ground terms from $H$ that are contained in at least one of the implicit generalizations $I_j$.

Implicit generalizations have applications to many areas of Computer Sci- 
ence like machine learning, unification, specification of abstract data types, logic 
programming, functional programming, etc. A good overview is given in [8].

Both the emptiness problem and the finite explicit representability problem have been shown to be coNP-complete in case of a single implicit generalization (cf. [6], [7], [5] and [11], respectively). Extending the coNP-completeness (and, in particular, the coNP-membership) of the emptiness problem to disjunctions of implicit generalizations is trivial, i.e.: $\mathcal{I} = I_1 \vee \ldots \vee I_n$ is empty, iff all the disjuncts $I_j$ are empty. On the other hand, the finite explicit representability problem for disjunctions of implicit generalizations is a bit more tricky. In particular, $\mathcal{I} = I_1 \vee \ldots \vee I_n$ may well have a finite explicit representation, even though some of the disjuncts $I_j$ possibly do not, e.g.: The implicit generalization $I_1 = f(x, y)/f(x, x)$ over the Herbrand universe $H$ with signature $\Sigma = \{a, f\}$ has no finite explicit representation, while the disjunction $\mathcal{I} = I_1 \vee I_2$ with $I_2 = f(x, x)$ is clearly equivalent to $E = f(x, y)$. In this paper, we present a new algorithm for the finite explicit representability problem, which will allow us to prove the coNP-completeness also in case of disjunctions of implicit generalizations.

This paper is structured as follows: In Sect. 2, we review some basic notions. 
An alternative algorithm for the finite explicit representability problem of a 
single implicit generalization is presented in Sect. 3, which will be extended to 
disjunctions of implicit generalizations in Sect. 4. In Sect. 5, this algorithm will 
be slightly modified in order to prove the coNP-completeness of the finite explicit 
representability problem. A comparison with related works will be provided in 
Sect. 6. Finally, in Sect. 7, we give a short conclusion. 



2 Preliminaries 

Throughout this paper, we only consider the case of an infinite Herbrand universe (i.e.: a universe with a finite signature that contains at least one proper function symbol), since otherwise the finite explicit representability problem is trivial. The property of terms which ultimately decides whether an implicit generalization actually has an equivalent explicit representation is the so-called "linearity", i.e.: We say that a term $t$ (or a tuple $\vec{t}$ of terms) is linear, iff every variable in $t$ (or in $\vec{t}$, respectively) occurs at most once. Otherwise $t$ is called non-linear. Let $\vec{x}$ denote the vector of variables that occur in some term $t$. Then we call an instance $t\vartheta$ of $t$ a linear instance of $t$ (or linear w.r.t. $t$), iff $\vec{x}\vartheta$ is linear. Otherwise $t\vartheta$ is called non-linear w.r.t. $t$. In general, the range of a substitution denotes a set of terms. However, in this paper, we usually have to deal with substitutions in the context of instances $t\vartheta$ of another term $t$. It is therefore more convenient to consider the range of a substitution as a vector of terms, namely: Let $\vec{x}$ denote the vector of variables in some term $t$. Then we refer to the vector $\vec{x}\vartheta$ as the range of $\vartheta$. In particular, we can then say that an instance $t\vartheta$ of $t$ is non-linear, iff there exists a multiply occurring variable in the range of $\vartheta$.
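The linearity test described above can be sketched in a few lines of Python. The term encoding (variables as strings, $f(s_1, \ldots, s_n)$ as the tuple `("f", s1, ..., sn)`) and all function names are assumptions made for this illustration only; they do not come from the paper.

```python
from collections import Counter

# Assumed encoding: a variable is a str, f(s1, ..., sn) is ("f", s1, ..., sn),
# and a constant a is the 1-tuple ("a",).

def variables(term):
    """Yield every variable occurrence in `term`, with repetitions."""
    if isinstance(term, str):
        yield term
    else:
        for arg in term[1:]:
            yield from variables(arg)

def is_linear(terms):
    """True iff no variable occurs more than once in the tuple `terms`."""
    counts = Counter(v for t in terms for v in variables(t))
    return all(n == 1 for n in counts.values())

def range_vector(t, theta):
    """Range of theta as a vector: images of the variables of t,
    taken once each in order of first occurrence."""
    seen = []
    for v in variables(t):
        if v not in seen:
            seen.append(v)
    return tuple(theta.get(v, v) for v in seen)

def is_linear_instance(t, theta):
    """True iff t*theta is a linear instance of t."""
    return is_linear(range_vector(t, theta))
```

With `t = ("f", "x", "y")`, the substitution `{"x": "z", "y": "z"}` yields the instance $f(z, z)$, which this sketch classifies as non-linear w.r.t. $t$.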

W.l.o.g. we can assume that all terms $t_i$ on the right-hand side of an implicit generalization $I = t/t_1 \vee \ldots \vee t_m$ are instances of $t$, since otherwise we would replace $t_i$ by the most general instance $\mathit{mgi}(t, t_i)$. If all the terms $t_i$ are linear w.r.t. $t$, then an equivalent explicit representation can be immediately obtained via the complement representation of linear instances from [9], i.e.: Let $P_i = \{p_{i1}, \ldots, p_{iM_i}\}$ be a set of pairwise variable disjoint terms, s.t. every $H$-ground instance of $t$ that is not an instance of $t_i = t\vartheta_i$ is contained in some term $p_{ij}$ and vice versa. Then $I$ is equivalent to the explicit generalization $E = \bigvee_{\alpha_1=1}^{M_1} \cdots \bigvee_{\alpha_m=1}^{M_m} \mathit{mgi}(p_{1\alpha_1}, \ldots, p_{m\alpha_m})$ (cf. [12], Corollary 3.5).

The implicit generalization $I = t/t\vartheta$ can be considered as the complement of $t\vartheta$ w.r.t. $t$, i.e.: $I$ contains all $H$-ground instances of $t$ that are not instances of $t\vartheta$. In [9], an algorithm is presented which computes an explicit representation of the complement of $t\vartheta$ w.r.t. $t$, if $t\vartheta$ is a linear instance of $t$. However, for our purposes, we need an appropriate representation of the complement of $t\vartheta$ w.r.t. $t$ also in case of a non-linear instance $t\vartheta$. In [4], such a representation is given via terms with equational constraints. Recall that a constrained term over some Herbrand universe $H$ is a pair $[t : X]$ consisting of a term $t$ and an equational formula $X$, s.t. an $H$-ground instance $t\sigma$ of $t$ is also an instance of $[t : X]$, iff $\sigma$ is a solution of $X$. Moreover, a term $t$ can be considered as a constrained term by adding the trivially true formula $\top$ as a constraint, i.e.: $t$ and $[t : \top]$ are equivalent. Then the complement of $t\vartheta$ w.r.t. $t$ can be constructed in the following way: Consider the tree representation of $\vartheta$, "deviate" from this representation at some node and close all other branches of $\vartheta$ as early as possible with new, pairwise distinct variables. Depending on the label of a node, this deviation can be done in two different ways: If a node is labelled by a constant or function symbol, then this node has to be labelled by a different constant or function symbol. If a node is labelled by a variable which also occurs at some other position, then the two occurrences of this variable have to be replaced by two fresh variables $x$, $y$ and the constraint $x \neq y$ has to be added. However, if a node is labelled by a variable which occurs nowhere else, then no deviation at all is possible at this node. We only need the following properties of this construction (for a proof of these properties, see [4]):

Theorem 2.1. (Complement of a Term) Let $t$ be a term over the Herbrand universe $H$ and let $t\vartheta$ be an instance of $t$. Then there exists a set of constrained terms $P = \{[p_1 : X_1], \ldots, [p_n : X_n]\}$ with the following properties:

1. $t/t\vartheta = [p_1 : X_1] \vee \ldots \vee [p_n : X_n]$, i.e.: Every $H$-ground instance of the complement of $t\vartheta$ w.r.t. $t$ is contained in some $[p_i : X_i]$ and vice versa.

2. For every $i \in \{1, \ldots, n\}$, $p_i$ is a linear instance of $t$ and $X_i$ is either the trivially true formula $\top$ or a (quantifier-free) disequation.

3. The size of every constrained term $[p_i : X_i] \in P$ and the number $n$ of such terms are polynomially bounded by the number of positions in $t\vartheta$.
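The "deviate at one node" construction described before Theorem 2.1 can be sketched as follows. This is an illustrative reading of that informal description, not the algorithm of [4]: the tuple encoding of terms, the signature given as a dict from symbol to arity, the fresh names `z1, z2, ...` (assumed not to clash with input variables), and the choice to compare consecutive occurrences of a variable that occurs more than twice are all assumptions of this sketch.

```python
import itertools

_counter = itertools.count(1)

def fresh():
    """Return a new variable name (assumed not to clash with the input)."""
    return f"z{next(_counter)}"

def deviations(s, signature):
    """Terms differing from s at exactly one symbol node; every branch off
    the path to that node is closed with a fresh variable."""
    if isinstance(s, str):              # no deviation at a variable by itself
        return []
    result = [(g,) + tuple(fresh() for _ in range(n))
              for g, n in signature.items() if g != s[0]]
    for i in range(1, len(s)):
        for d in deviations(s[i], signature):
            result.append((s[0],) + tuple(d if j == i else fresh()
                                          for j in range(1, len(s))))
    return result

def occurrences(s, path=()):
    """Yield (variable, path) for every variable occurrence in s."""
    if isinstance(s, str):
        yield s, path
    else:
        for i in range(1, len(s)):
            yield from occurrences(s[i], path + (i,))

def skeleton(s, marks, path=()):
    """Copy s along the paths in `marks` (path -> variable name) and close
    every other branch with a fresh variable."""
    if path in marks:
        return marks[path]
    if not any(p[:len(path)] == path for p in marks):
        return fresh()
    return (s[0],) + tuple(skeleton(s[i], marks, path + (i,))
                           for i in range(1, len(s)))

def complement(range_vec, signature):
    """List of (vector, constraint) pairs; constraint is None (trivially
    true) or a pair (x, y) standing for the disequation x != y."""
    n = len(range_vec)
    result = []
    for i, s in enumerate(range_vec):
        for d in deviations(s, signature):
            result.append((tuple(d if j == i else fresh() for j in range(n)), None))
    root = ("vec",) + tuple(range_vec)   # wrap the vector as one pseudo-term
    by_var = {}
    for v, path in occurrences(root):
        by_var.setdefault(v, []).append(path)
    for paths in by_var.values():
        for p1, p2 in zip(paths, paths[1:]):
            x, y = fresh(), fresh()
            result.append((skeleton(root, {p1: x, p2: y})[1:], (x, y)))
    return result
```

For the range vector `("y21", "y21", ("g", "y22"))` over the signature `{"a": 0, "f": 2, "g": 1}`, the sketch returns two unconstrained vectors (third component `a` and `f(..)`) and one vector carrying a disequation between its first two components.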

Note that the equational constraints are only needed in order to finitely express the complement of a non-linear instance. Hence, the two approaches from [9] and [4] are quite similar, when only linear instances are considered. The representation of the complement of a term from Theorem 2.1 can be easily extended to implicit generalizations, i.e.: Let $I = t/t_1 \vee \ldots \vee t_m$ be an implicit generalization and let $P = \{[p_1 : X_1], \ldots, [p_n : X_n]\}$ be the complement of $t$ w.r.t. some variable $x$, then $P \cup \{t_1, \ldots, t_m\}$ is a representation of the complement of $I$. By writing the terms $t_i$ as constrained terms $[t_i : \top]$ with trivially true constraint, we end up again with a set of constrained terms.

A conjunction of terms $t_1 \wedge \ldots \wedge t_n$ over $H$ represents the set of all terms $s \in H$ that are instances of every term $t_i$. Suppose that the terms $t_i$ are pairwise variable disjoint. Then the set of ground terms contained in such a conjunction corresponds to the $H$-ground instances of the most general instance $\mathit{mgi}(t_1, \ldots, t_n)$. The most general unifier of terms $t_1, \ldots, t_n$ is denoted by $\mathit{mgu}(t_1, \ldots, t_n)$. When computing the intersection of the complement of various terms $t\vartheta_1, \ldots, t\vartheta_m$, we shall have to deal with conjunctions of constrained terms of the form $[p_1 : X_1] \wedge \ldots \wedge [p_m : X_m]$, where the $p_i$'s are pairwise variable disjoint. Analogously to the case of terms without constraints, the set of ground terms contained in such a conjunction is equivalent to a single constrained term, which can be obtained via unification, namely: $[p_1\mu : Z\mu]$ with $\mu = \mathit{mgu}(p_1, \ldots, p_m)$ and $Z = X_1 \wedge \ldots \wedge X_m$. Hence, for testing whether the intersection of constrained terms is empty, we have to check whether $\mu = \mathit{mgu}(p_1, \ldots, p_m)$ exists and whether $Z\mu$ has at least one solution. Since the constraint part $Z\mu$ will always be a conjunction of disequations, the latter condition holds, iff $Z\mu$ contains no trivial disequation of the form $t \neq t$. (For a proof see [1], Lemma 2).
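The emptiness test for a conjunction of constrained terms can be sketched with a textbook syntactic unification routine. The encoding of terms as tuples, the representation of each constraint as a list of disequation pairs, and the names `unify`, `apply` and `conjunction` are assumptions of this sketch; the test for trivial disequations relies on the infinite Herbrand universe assumed throughout the paper.

```python
def walk(t, subst):
    """Follow variable bindings until an unbound variable or a compound term."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def apply(t, subst):
    """Apply the (triangular) substitution to the term t."""
    t = walk(t, subst)
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(apply(a, subst) for a in t[1:])

def occurs(v, t, subst):
    """Occurs check: does variable v appear in t under subst?"""
    t = walk(t, subst)
    if isinstance(t, str):
        return v == t
    return any(occurs(v, a, subst) for a in t[1:])

def unify(s, t, subst):
    """Extend subst to a unifier of s and t, or return None."""
    s, t = walk(s, subst), walk(t, subst)
    if s == t:
        return subst
    if isinstance(s, str):
        return None if occurs(s, t, subst) else {**subst, s: t}
    if isinstance(t, str):
        return unify(t, s, subst)
    if s[0] != t[0] or len(s) != len(t):
        return None
    for a, b in zip(s[1:], t[1:]):
        subst = unify(a, b, subst)
        if subst is None:
            return None
    return subst

def conjunction(constrained_terms):
    """Return (p1*mu, Z*mu) for [p1:X1] and ... and [pm:Xm], or None if the
    intersection is empty. Each Xi is a list of disequations (lhs, rhs);
    the empty list stands for the trivially true constraint."""
    (first, _), *rest = constrained_terms
    mu = {}
    for p, _ in rest:
        mu = unify(first, p, mu)
        if mu is None:
            return None                      # no common instance at all
    z_mu = [(apply(l, mu), apply(r, mu))
            for _, xs in constrained_terms for l, r in xs]
    if any(l == r for l, r in z_mu):
        return None                          # trivial disequation t != t
    return apply(first, mu), z_mu
```

Intersecting $f(f(y, y), g(w))$ with the constrained term $[f(f(z_1, z_2), z_3) : z_1 \neq z_2]$, for instance, unifies $z_1$ and $z_2$, so the instantiated constraint becomes trivial and the sketch reports an empty intersection.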

3 The Basic Algorithm 

In [9], it is shown that an implicit generalization $I = t/t\vartheta_1 \vee \ldots \vee t\vartheta_m$ has an equivalent explicit representation, iff every non-linear instance $t\vartheta_i$ of $t$ can be replaced by a finite number of linear instances of $t$ (cf. [9], Proposition 4.7). A non-linear instance $t\vartheta_i$ of $t$ which cannot be replaced by a finite number of linear ones will be referred to as "essentially non-linear" in this paper, e.g.:

Example 3.1. (Essential Non-linearity) Let $I = f(x, y)/f(x, x)$ be an implicit generalization over the Herbrand universe $H$ with signature $\Sigma = \{f, a\}$. It is shown in [9] that $I$ has no equivalent explicit representation. In particular, the non-linear term $f(x, x)$ cannot be replaced by finitely many linear ones. In other words, $f(x, x)$ is essentially non-linear. On the other hand, consider the implicit generalization $J = f(y_1, y_2)/[f(x, x) \vee f(f(x_1, x_2), x_3)]$. Then $J' = f(y_1, y_2)/[f(a, a) \vee f(f(x_1, x_2), x_3)]$ is equivalent to $J$, i.e.: the term $f(x, x)$ is inessentially non-linear, since it can be replaced by the linear term $f(a, a)$.
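The equivalence of $J$ and $J'$ in Example 3.1 can be sanity-checked by brute force on all ground terms up to a fixed depth. The sketch below is only a bounded check, not a proof; the encoding of terms as tuples and the helper names are assumptions made for the illustration.

```python
from itertools import product

A = ("a",)

def ground_terms(depth):
    """All ground terms over {a, f} (f binary) of depth at most `depth`."""
    terms = [A]
    for _ in range(depth):
        terms = [A] + [("f", l, r) for l, r in product(terms, repeat=2)]
    return terms

def match(pattern, term, binding):
    """Extend `binding` so that the pattern equals the ground term, or
    return None. Repeated variables must receive identical subterms."""
    if isinstance(pattern, str):
        if pattern in binding:
            return binding if binding[pattern] == term else None
        return {**binding, pattern: term}
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, s in zip(pattern[1:], term[1:]):
        binding = match(p, s, binding)
        if binding is None:
            return None
    return binding

def is_instance(term, pattern):
    return match(pattern, term, {}) is not None

def contains(generalization, term):
    """Membership in the implicit generalization t / e1 v ... v ek."""
    t, excluded = generalization
    return is_instance(term, t) and not any(is_instance(term, e) for e in excluded)

J = (("f", "y1", "y2"), [("f", "x", "x"), ("f", ("f", "x1", "x2"), "x3")])
J_PRIME = (("f", "y1", "y2"), [("f", A, A), ("f", ("f", "x1", "x2"), "x3")])

def agree_up_to(depth):
    """True iff J and J' contain the same ground terms of depth <= depth."""
    return all(contains(J, s) == contains(J_PRIME, s) for s in ground_terms(depth))
```

Both generalizations contain exactly the terms $f(a, r)$ with $r \neq a$, which is why replacing $f(x, x)$ by $f(a, a)$ loses nothing here.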

In this section, we provide a new decision procedure for the finite explicit representability problem of a single implicit generalization by formalizing the notion of "essential non-linearity". To this end, we identify in Theorem 3.1 a disjunction of terms $s_1 \vee \ldots \vee s_n$, by which a non-linear term $t\vartheta_1$ on the right-hand side of an implicit generalization $I = t/t\vartheta_1 \vee \ldots \vee t\vartheta_m$ may be replaced. In Theorem 3.2, we shall then show that $I$ has no finite explicit representation, if at least one of these new terms $s_j$ is non-linear w.r.t. $t$. The idea of the replacement step in Theorem 3.1 is that we may restrict the term $t\vartheta_1$ on the right-hand side of $I$ to those instances, which are in the complement of the remaining terms $t\vartheta_j$ with $j \neq 1$. The correctness of this step corresponds to the equality $A - [B \cup C] = A - [(B - C) \cup C]$, which holds for arbitrary sets $A$, $B$ and $C$.

Theorem 3.1. (Replacing Non-linear Terms) Let $I = t/t\vartheta_1 \vee \ldots \vee t\vartheta_m$ be an implicit generalization and suppose that the term $t\vartheta_1$ is non-linear w.r.t. $t$. Moreover, let $P_i = \{[p_{i1} : X_{i1}], \ldots, [p_{iM_i} : X_{iM_i}]\}$ represent the complement of $t\vartheta_i$ w.r.t. $t$ according to Theorem 2.1 and let all terms in $\{t\vartheta_1\} \cup \{p_{i\alpha_i} \mid 2 \leq i \leq m$ and $1 \leq \alpha_i \leq M_i\}$ be pairwise variable disjoint. Then $I$ is equivalent to the implicit generalization $I' = t/t\vartheta_2 \vee \ldots \vee t\vartheta_m \vee \bigvee_{\mu \in M} t\vartheta_1\mu$ with

$M = \{\mu \mid \exists \alpha_2 \ldots \exists \alpha_m$, s.t. $\mu = \mathit{mgu}(t\vartheta_1, p_{2\alpha_2}, \ldots, p_{m\alpha_m})$ and $(X_{2\alpha_2} \wedge \ldots \wedge X_{m\alpha_m})\mu$ contains no trivial disequation$\}$.

Proof. All terms $t\vartheta_1\mu$ with $\mu \in M$ are instances of $t\vartheta_1$. Hence, $I \subseteq I'$ trivially holds. We only have to prove the opposite subset relation: Let $u \in I'$, i.e.: $u$ is an instance of $t$ but not of $t\vartheta_2 \vee \ldots \vee t\vartheta_m$ nor of any term $t\vartheta_1\mu$. We have to show that then $u$ is not an instance of $t\vartheta_1$. Suppose on the contrary that $u$ is an instance of $t\vartheta_1$. Moreover, by assumption, $u$ is not an instance of any term $t\vartheta_i$ with $i \geq 2$. But for every $i$, $P_i = \{[p_{i1} : X_{i1}], \ldots, [p_{iM_i} : X_{iM_i}]\}$ completely covers the complement of $t\vartheta_i$ w.r.t. $t$. Hence, there exist indices $\alpha_2, \ldots, \alpha_m$, s.t. $u \in t\vartheta_1 \wedge [p_{2\alpha_2} : X_{2\alpha_2}] \wedge \ldots \wedge [p_{m\alpha_m} : X_{m\alpha_m}]$. By the considerations from Sect. 2, this conjunction is equivalent to $[t\vartheta_1\mu : Z\mu]$ with $\mu = \mathit{mgu}(t\vartheta_1, p_{2\alpha_2}, \ldots, p_{m\alpha_m})$ and $Z = X_{2\alpha_2} \wedge \ldots \wedge X_{m\alpha_m}$. Moreover, $[t\vartheta_1\mu : Z\mu] \subseteq t\vartheta_1\mu$ holds. But then $u$ is not an instance of $I'$, which contradicts our original assumption. □



Theorem 3.2. (Essential Non-linearity) Let $I = t/t\vartheta_1 \vee \ldots \vee t\vartheta_m$ be an implicit generalization. Moreover, let $P_i = \{[p_{i1} : X_{i1}], \ldots, [p_{iM_i} : X_{iM_i}]\}$ represent the complement of $t\vartheta_i$ w.r.t. $t$ according to Theorem 2.1 and let all terms in $\{t\vartheta_1\} \cup \{p_{i\alpha_i} \mid 2 \leq i \leq m$ and $1 \leq \alpha_i \leq M_i\}$ be pairwise variable disjoint. Finally, suppose that the term $t\vartheta_1$ is non-linear w.r.t. $t$ and that there exist indices $\alpha_2, \ldots, \alpha_m$ with $1 \leq \alpha_i \leq M_i$ for all $i$, s.t. the following conditions hold:

1. $\mu = \mathit{mgu}(t\vartheta_1, p_{2\alpha_2}, \ldots, p_{m\alpha_m})$ exists.

2. $Z\mu$ with $Z = X_{2\alpha_2} \wedge \ldots \wedge X_{m\alpha_m}$ contains no trivial disequation.

3. There exists a multiply occurring variable $y$ in the range of $\vartheta_1$, s.t. $y\mu$ is a non-ground term.

Then $I$ has no finite explicit representation.

Proof. (indirect) For our proof, we have to make use of certain terms $C_0(D), C_1(D), \ldots$ from [9], which are defined as follows: Let $f$ be a function symbol in the signature of $H$ with arity $q$ and let $a$ be a constant. $F_D(l_1, \ldots, l_{q^D})$ denotes the term whose tree representation has the label $f$ at all nodes down to depth $D - 1$ and whose leaf nodes at depth $D$ are labelled with $l_1$ through $l_{q^D}$. $G_D$ corresponds to the special case where all leaves $l_i$ are labelled with the constant $a$, i.e.: $G_D = F_D(a, \ldots, a)$, e.g.: For a binary function symbol $f$, $G_2 = f(f(a, a), f(a, a))$ holds. Then, for every $i \geq 0$, the term $C_i(D)$ is defined as $C_i(D) = F_D(G_{D \times (q^D \times i + 1)}, G_{D \times (q^D \times i + 2)}, \ldots, G_{D \times q^D \times (i + 1)})$. These terms have the following property: If there exists an index $i$, s.t. $C_i(D)$ is an instance of a term $t$ of depth smaller than $D$, then $t$ contains no multiple variable occurrences and $C_j(D)$ is also an instance of $t$ for every index $j$. (For any details, see [9]).
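The terms $F_D$, $G_D$ and $C_i(D)$ recalled above can be built mechanically. The following sketch follows the definition as reconstructed from the text (in particular the indices $D \times (q^D \times i + k)$ for $k = 1, \ldots, q^D$); the tuple encoding of terms and the default arity `q=2` are assumptions of this illustration.

```python
A = ("a",)

def F(D, leaves, q=2):
    """F_D(l_1, ..., l_{q^D}): label f at all nodes down to depth D-1,
    with the given terms attached as leaves at depth D."""
    assert len(leaves) == q ** D
    level = list(leaves)
    for _ in range(D):
        # Group q neighbouring subtrees under a new f-node, bottom-up.
        level = [("f",) + tuple(level[k:k + q]) for k in range(0, len(level), q)]
    return level[0]

def G(D, q=2):
    """G_D = F_D(a, ..., a): the complete f-tree of depth D over the constant a."""
    return F(D, [A] * q ** D, q)

def C(i, D, q=2):
    """C_i(D) = F_D(G_{D*(q^D*i+1)}, ..., G_{D*q^D*(i+1)})."""
    n = q ** D
    return F(D, [G(D * (n * i + k), q) for k in range(1, n + 1)], q)
```

For a binary $f$ and $D = 1$ this gives $C_0(1) = f(G_1, G_2)$ and $C_1(1) = f(G_3, G_4)$. Note that the size of these terms grows exponentially in their depth, so only small values of $i$ and $D$ are practical.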







Reinhard Pichler 



Now let $y\mu = f[z]$, where $f[z]$ is some term containing the variable $z$. We modify $\vartheta_1$ to $\vartheta_1'$ by replacing one occurrence of $y$ in the range of $\vartheta_1$ by a fresh variable $y'$ and extend $\mu$ to $\mu'$, s.t. $y'\mu' = f[z']$ for another fresh variable $z'$. Now suppose that $I$ is equivalent to the explicit generalization $E = r_1 \vee \ldots \vee r_M$. Then we derive a contradiction in the following way:

1. $t\vartheta_1'\mu'$ is an instance of $p_{2\alpha_2}\pi$ with $\pi = \mathit{mgu}(p_{2\alpha_2}, \ldots, p_{m\alpha_m})$: By the definition of the complement from Theorem 2.1, all terms $p_{i\alpha_i}$ are linear w.r.t. $t$ and, therefore, also $p_{2\alpha_2}\pi$ is linear w.r.t. $t$. Hence, $p_{2\alpha_2}\pi = t\varphi$ for some substitution $\varphi$ which has no multiple variable occurrences in its range. Moreover, $t\vartheta_1\mu$ is an instance of $[p_{2\alpha_2}\pi : Z\pi]$ with $Z = X_{2\alpha_2} \wedge \ldots \wedge X_{m\alpha_m}$. Thus, there exists a substitution $\lambda$, s.t. $\vartheta_1\mu = \varphi\lambda$. Now consider the occurrence of $f[z]$ in $\vartheta_1\mu$ which is replaced by $f[z']$ in $\vartheta_1'\mu'$. There must be some variable $x$ in the range of $\varphi$ which is instantiated by $\lambda$ to some term containing this occurrence of the variable $z$. But then, since all variables in the range of $\varphi$ occur only once, $\lambda$ can be modified to $\lambda'$ s.t. this occurrence of $z$ in $x\lambda$ is replaced by $z'$ and $\lambda'$ coincides with $\lambda$ everywhere else. Thus $\varphi\lambda' = \vartheta_1'\mu'$ with $\lambda = \lambda' \circ \{z' \leftarrow z\}$ holds. Hence, in particular, $t\vartheta_1'\mu'$ is an instance of $p_{2\alpha_2}\pi$ with $p_{2\alpha_2}\pi\lambda' = t\vartheta_1'\mu'$ and $p_{2\alpha_2}\pi\lambda' \circ \{z' \leftarrow z\} = t\vartheta_1\mu$. Moreover, since the equational problems $X_{i\alpha_i}$ only contain variables from $p_{i\alpha_i}$, the equivalence $Z\mu = Z\pi\lambda' \circ \{z' \leftarrow z\}$ also holds.

2. Construction of two ground terms $s'$ and $s$, s.t. $s'$ is in $I$ and $s$ is outside: By assumption, $Z\mu = Z\pi\lambda' \circ \{z' \leftarrow z\}$ contains no trivial disequation. Hence, $Z\pi\lambda'$ contains no trivial disequation either and, therefore, there exists a solution $\sigma'$ of $Z\pi\lambda'$, i.e.: $Z\pi\lambda'\sigma'$ is yet another conjunction of disequations with no trivial disequation. Now let $D$ be an integer, s.t. $D$ is greater than the depth of any term occurring in $Z\pi\lambda'\sigma'$ and greater than the depth of $\mathit{mgi}(t\vartheta_1'\mu', r_\gamma)$ for all terms $r_\gamma$ from the explicit representation $E$ of $I$. Then we can modify $\sigma'$ to $\tau'$, s.t. both substitutions coincide on all variables except for $z'$ and $z$, where we define $z\tau' = C_0(D)$ and $z'\tau' = C_1(D)$. By the definition of $C_0(D)$ and $C_1(D)$, no trivial disequation can be introduced into $Z\pi\lambda'\tau'$ by this transformation from $Z\pi\lambda'\sigma'$. Let the term $s'$ be defined as $s' = p_{2\alpha_2}\pi\lambda'\tau'$. Then, on the one hand, $s'$ is an instance of $[p_{2\alpha_2}\pi : Z\pi]$. On the other hand, $s'$ is not an instance of $t\vartheta_1\mu = p_{2\alpha_2}\pi\lambda' \circ \{z' \leftarrow z\}$, since $\tau'$ assigns different values to $z$ and $z'$. However, if we define the substitution $\tau$ in such a way that it instantiates both variables $z$ and $z'$ to $C_0(D)$ and $\tau$ coincides with $\sigma'$ everywhere else, then $s = p_{2\alpha_2}\pi\lambda'\tau$ is indeed an instance of $t\vartheta_1\mu = p_{2\alpha_2}\pi\lambda' \circ \{z' \leftarrow z\}$ and, in particular, of $t\vartheta_1$. Thus, $s'$ is an instance of $I = t/t\vartheta_1 \vee \ldots \vee t\vartheta_m$ whereas $s$ is not.

3. If $s'$ is an instance of $r_\gamma$, then $s$ is also an instance of $r_\gamma$: By assumption, $I$ is equivalent to $E = r_1 \vee \ldots \vee r_M$. Hence, there exists a term $r_\gamma$, s.t. $s'$ is an instance of $r_\gamma$ and, therefore, also of $\mathit{mgi}(t\vartheta_1'\mu', r_\gamma)$. Thus, there exist substitutions $\rho'$ and $\eta'$, s.t. $t\vartheta_1'\mu'\rho' = \mathit{mgi}(t\vartheta_1'\mu', r_\gamma)$ and $t\vartheta_1'\mu'\rho'\eta' = s'$. By construction, $C_1(D)$ is an instance of $z'\rho'$ and the term depth of $z'\rho'$ is smaller than $D$. Moreover, all subterms of $C_1(D)$ with root at depth smaller than $D$ occur nowhere else in the range of $\vartheta_1'\mu'\rho'\eta'$. Hence, the variables in $z'\rho'$ occur nowhere else in the range of $\rho'$ and, by the properties of the terms $C_i(D)$ recalled above, $C_0(D)$ is also an instance of $z'\rho'$. We can thus modify $\eta'$ to $\eta$, s.t. $z'\rho'\eta = C_0(D)$ holds and $\eta$ coincides with $\eta'$ on all variables not occurring in $z'\rho'$. Thus, in particular, $t\vartheta_1'\mu'\rho'\eta = s$ holds. But then, $s$ is an instance of $t\vartheta_1'\mu'\rho' = \mathit{mgi}(t\vartheta_1'\mu', r_\gamma)$ and, therefore, also of $r_\gamma$, which contradicts the assumption that $r_\gamma$ is a disjunct from the explicit representation $E$ of $I$. □

By combining the two theorems above, we immediately get a new algorithm for deciding the finite explicit representability problem of a single implicit generalization, namely: Let $I = t/t\vartheta_1 \vee \ldots \vee t\vartheta_m$ be an implicit generalization. If all terms $t\vartheta_i$ are linear w.r.t. $t$, then $I$ clearly has an equivalent explicit representation, which can be easily obtained via the complement representation of linear terms from [9] (cf. Sect. 2). On the other hand, if one of the terms $t\vartheta_i$ is a non-linear instance of $t$, then we may assume w.l.o.g. that $i = 1$ holds. Hence, we can either apply Theorem 3.1 or Theorem 3.2. In the former case, the non-linear instance $t\vartheta_1$ of $t$ is replaced by a disjunction of linear instances. In the latter case, it has been shown that no equivalent explicit representation exists. The termination of this algorithm is obvious, since the number of non-linear instances of $t$ on the right-hand side of $I$ is strictly decreased, whenever we apply Theorem 3.1. This algorithm is put to work in the following example:

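Example 3.2. Let $I = t/t\vartheta_1 \vee \ldots \vee t\vartheta_4$ be an implicit generalization over the Herbrand universe $H$ with signature $\Sigma = \{a, f, g\}$, s.t. $t = f(f(x_1, x_2), x_3)$ and

$\vartheta_1 = \{x_1 \leftarrow f(y_{11}, y_{12}), x_2 \leftarrow y_{13}, x_3 \leftarrow y_{13}\}$

$\vartheta_2 = \{x_1 \leftarrow y_{21}, x_2 \leftarrow y_{21}, x_3 \leftarrow g(y_{22})\}$

$\vartheta_3 = \{x_1 \leftarrow y_{31}, x_2 \leftarrow f(y_{32}, y_{33}), x_3 \leftarrow y_{34}\}$

$\vartheta_4 = \{x_1 \leftarrow y_{41}, x_2 \leftarrow g(y_{42}), x_3 \leftarrow y_{43}\}$

Note that every instance of $t$ is of the form $t\sigma$ with $\sigma = \{x_1 \leftarrow s_1, x_2 \leftarrow s_2, x_3 \leftarrow s_3\}$. In order to keep the notation simple, we denote such an instance by $t(s_1, s_2, s_3)$. Then the complement of the terms $t\vartheta_2$, $t\vartheta_3$ and $t\vartheta_4$, respectively, is represented by the following sets:

$P_2 = \{[t(z_{21}, z_{22}, a) : \top], [t(z_{21}, z_{22}, f(z_{23}, z_{24})) : \top], [t(z_{21}, z_{22}, z_{23}) : z_{21} \neq z_{22}]\}$

$P_3 = \{[t(z_{31}, a, z_{32}) : \top], [t(z_{31}, g(z_{32}), z_{33}) : \top]\}$

$P_4 = \{[t(z_{41}, a, z_{42}) : \top], [t(z_{41}, f(z_{42}, z_{43}), z_{44}) : \top]\}$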

In order to apply Theorem 3.1, we have to compute the set $M$ of certain unifiers. We shall write $\mu_{(\alpha_2, \alpha_3, \alpha_4)}$ to denote the mgu of $t\vartheta_1$ with the terms $p_{2\alpha_2}$, $p_{3\alpha_3}$ and $p_{4\alpha_4}$ from the sets $P_2$, $P_3$ and $P_4$. Then $M$ according to Theorem 3.1 consists of a single element, namely $\mu_{(1,1,1)} = \mu_{(3,1,1)} = \{y_{13} \leftarrow a\}$. Hence, $I$ may be transformed into $I' = t/t\vartheta_2 \vee t\vartheta_3 \vee t\vartheta_4 \vee t(f(y_{11}, y_{12}), a, a)$.

We already know the representations $P_2' = P_3$ and $P_3' = P_4$ of the complement of $t\vartheta_3$ and $t\vartheta_4$. Moreover, we need the complement $P_4'$ of $t(f(y_{11}, y_{12}), a, a)$, namely:

$P_4' = \{[t(a, z_{51}, z_{52}) : \top], [t(g(z_{51}), z_{52}, z_{53}) : \top], [t(z_{51}, f(z_{52}, z_{53}), z_{54}) : \top], [t(z_{51}, g(z_{52}), z_{53}) : \top], [t(z_{51}, z_{52}, f(z_{53}, z_{54})) : \top], [t(z_{51}, z_{52}, g(z_{53})) : \top]\}$

Then $M$ (for $I'$) from Theorem 3.1 again consists of a single element only, namely: $\mu_{(1,1,1)} = \mu_{(1,1,6)} = \{y_{21} \leftarrow a\}$. The non-linear instance $t\vartheta_2$ in $I'$ may therefore be replaced by the linear instance $t\vartheta_2\mu_{(1,1,1)} = t(a, a, g(y_{22}))$. We have thus transformed $I$ into $I'' = t/t(y_{31}, f(y_{32}, y_{33}), y_{34}) \vee t(y_{41}, g(y_{42}), y_{43}) \vee t(f(y_{11}, y_{12}), a, a) \vee t(a, a, g(y_{22}))$, which has only linear instances of $t$ on the right-hand side. Hence, $I$ has a finite explicit representation. In order to actually compute an equivalent explicit generalization, we need the representation of the complement of all terms on the right-hand side. $P_1'' = P_3$, $P_2'' = P_4$ and $P_3'' = P_4'$ have already been computed. The complement of $t(a, a, g(y_{22}))$ can be represented by

$P_4'' = \{t(g(z_{51}), z_{52}, z_{53}), t(f(z_{51}, z_{52}), z_{53}, z_{54}), t(z_{51}, g(z_{52}), z_{53}), t(z_{51}, f(z_{52}, z_{53}), z_{54}), t(z_{51}, z_{52}, a), t(z_{51}, z_{52}, f(z_{53}, z_{54}))\}$

Hence, the explicit representation $E$ of $I$ is of the form

$(\bigvee_{p_1 \in P_1''} p_1) \wedge (\bigvee_{p_2 \in P_2''} p_2) \wedge (\bigvee_{p_3 \in P_3''} p_3) \wedge (\bigvee_{p_4 \in P_4''} p_4) = \bigvee_{p_1 \in P_1''} \bigvee_{p_2 \in P_2''} \bigvee_{p_3 \in P_3''} \bigvee_{p_4 \in P_4''} (p_1 \wedge p_2 \wedge p_3 \wedge p_4)$

By computing all possible conjunctions $p_1 \wedge p_2 \wedge p_3 \wedge p_4$ and deleting those terms which are a proper instance of another conjunction, we get

$E = t(a, a, a) \vee t(f(y_1, y_2), a, g(y_3)) \vee t(y_1, a, f(y_2, y_3)) \vee t(g(y_1), a, y_2) = f(f(a, a), a) \vee f(f(f(y_1, y_2), a), g(y_3)) \vee f(f(y_1, a), f(y_2, y_3)) \vee f(f(g(y_1), a), y_2)$

4 Disjunctions of Implicit Generalizations 

We shall now extend the notion of essential non-linearity to disjunctions of implicit generalizations $\mathcal{I} = I_1 \vee \ldots \vee I_n$ with $I_i = t_i/t_i\vartheta_{i1} \vee \ldots \vee t_i\vartheta_{im_i}$ for $i \in \{1, \ldots, n\}$. To this end, we again provide a disjunction of terms $s_1 \vee \ldots \vee s_l$, by which a non-linear term $t_i\vartheta_{ij}$ on the right-hand side of an implicit generalization $I_i$ may be replaced. Moreover, we claim that $\mathcal{I}$ has no finite explicit representation, if at least one of these new terms $s_k$ is non-linear w.r.t. $t_i$. The idea of the replacement step in Theorem 4.1 is twofold: On the one hand, we may restrict the term $t_i\vartheta_{ij}$ to those instances, which are in the complement of the remaining terms $t_i\vartheta_{ik}$ on the right-hand side of $I_i$ with $k \neq j$. On the other hand, we may further restrict the term $t_i\vartheta_{ij}$ to those instances, which are in the complement of the other $I_k$'s with $k \neq i$. The correctness of this replacement step is due to the equalities $A - [B \cup C] = A - [(B - C) \cup C]$ and $[A - B] \cup C = [A - (B - C)] \cup C$, respectively, which hold for arbitrary sets $A$, $B$ and $C$.
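The two set equalities used above are elementary, and they can also be confirmed mechanically. The following sketch checks both of them for every triple of subsets of a small universe; the function names and the choice of universe are assumptions of this illustration.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of `universe`, as Python sets."""
    items = list(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def identities_hold(universe):
    """Check A-[B u C] = A-[(B-C) u C] and [A-B] u C = [A-(B-C)] u C
    for all subsets A, B, C of the given universe."""
    all_sets = subsets(universe)
    for a in all_sets:
        for b in all_sets:
            for c in all_sets:
                if a - (b | c) != a - ((b - c) | c):
                    return False
                if (a - b) | c != (a - (b - c)) | c:
                    return False
    return True
```

Both equalities follow from the observations that $(B - C) \cup C = B \cup C$ and that $A - (B - C) = (A - B) \cup (A \cap C)$.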

Theorem 4.1. (Replacing Non-linear Terms) Let $\mathcal{I} = I_1 \vee \ldots \vee I_n$ with $I_j = t_j/t_j\vartheta_{j1} \vee \ldots \vee t_j\vartheta_{jm_j}$ for $j \in \{1, \ldots, n\}$ be a disjunction of implicit generalizations. Suppose that the term $t_1\vartheta_{11}$ is non-linear w.r.t. $t_1$ and let $\vec{y}$ denote the vector of variables that occur more than once in the range of $\vartheta_{11}$. Moreover, let $P_i = \{[p_{i1} : X_{i1}], \ldots, [p_{iM_i} : X_{iM_i}]\}$ represent the complement of $t_1\vartheta_{1i}$ w.r.t. $t_1$ according to Theorem 2.1 and let $Q_j = \{[q_{j1} : Y_{j1}], \ldots, [q_{jN_j} : Y_{jN_j}]\}$ represent the complement of $I_j$. Finally, let all terms in $\{t_1\vartheta_{11}\} \cup \{p_{i\alpha_i} \mid 2 \leq i \leq m_1$ and $1 \leq \alpha_i \leq M_i\} \cup \{q_{j\beta_j} \mid 2 \leq j \leq n$ and $1 \leq \beta_j \leq N_j\}$ be pairwise variable disjoint. Then $\mathcal{I}$ is equivalent to $\mathcal{I}' = I_1' \vee I_2 \vee \ldots \vee I_n$, where $I_1'$ is defined as

$I_1' = t_1/t_1\vartheta_{12} \vee \ldots \vee t_1\vartheta_{1m_1} \vee \bigvee_{\mu \in M} t_1\vartheta_{11}\mu$ with

$M = \{\mu \mid \exists \alpha_2 \ldots \exists \alpha_{m_1} \exists \beta_2 \ldots \exists \beta_n$, s.t. $\nu = \mathit{mgu}(t_1\vartheta_{11}, p_{2\alpha_2}, \ldots, p_{m_1\alpha_{m_1}}, q_{2\beta_2}, \ldots, q_{n\beta_n})$, $(X_{2\alpha_2} \wedge \ldots \wedge X_{m_1\alpha_{m_1}} \wedge Y_{2\beta_2} \wedge \ldots \wedge Y_{n\beta_n})\nu$ contains no trivial disequation and $\mu = \nu|_{\vec{y}}\}$.

Proof. All terms $t_1\vartheta_{11}\mu$ with $\mu \in M$ are instances of $t_1\vartheta_{11}$. Hence, $I_1 \subseteq I_1'$ and, therefore, also $\mathcal{I} \subseteq \mathcal{I}'$ trivially holds. So we only have to prove the opposite subset relation: Let $u \in \mathcal{I}'$. If $u \in I_2 \vee \ldots \vee I_n$, then $u$ is of course also contained in $\mathcal{I}$. Thus, the only interesting case to consider is that $u \in I_1'$ and $u \notin I_2 \vee \ldots \vee I_n$. Hence, in particular, $u$ is in the complement of every $t_1\vartheta_{1i}$ with $2 \leq i \leq m_1$ and in the complement of every $I_j$ with $2 \leq j \leq n$. But for every $i$, $P_i = \{[p_{i1} : X_{i1}], \ldots, [p_{iM_i} : X_{iM_i}]\}$ completely covers the complement of $t_1\vartheta_{1i}$ w.r.t. $t_1$. Likewise, for every $j$, $Q_j = \{[q_{j1} : Y_{j1}], \ldots, [q_{jN_j} : Y_{jN_j}]\}$ completely covers the complement of $I_j$. Hence, there exist indices $\alpha_2, \ldots, \alpha_{m_1}$ and $\beta_2, \ldots, \beta_n$, s.t. $u \in \bigwedge_{i=2}^{m_1} [p_{i\alpha_i} : X_{i\alpha_i}] \wedge \bigwedge_{j=2}^{n} [q_{j\beta_j} : Y_{j\beta_j}]$. Then, analogously to the proof of Theorem 3.1, it can be shown that if $u$ is not an instance of any term $t_1\vartheta_{11}\mu$ on the right-hand side of $I_1'$, then $u$ is not an instance of $t_1\vartheta_{11}$ either. In other words, if $u$ is an instance of $I_1'$ but not of any $I_i$ with $i \geq 2$, then $u$ is actually an instance of $I_1$. □



Theorem 4.2. (Essential Non-linearity) Let $\mathcal{I} = I_1 \vee \ldots \vee I_n$ with $I_j = t_j/t_j\vartheta_{j1} \vee \ldots \vee t_j\vartheta_{jm_j}$ for $j \in \{1, \ldots, n\}$ be a disjunction of implicit generalizations. Moreover, let $P_i = \{[p_{i1} : X_{i1}], \ldots, [p_{iM_i} : X_{iM_i}]\}$ represent the complement of $t_1\vartheta_{1i}$ w.r.t. $t_1$ and let $Q_j = \{[q_{j1} : Y_{j1}], \ldots, [q_{jN_j} : Y_{jN_j}]\}$ represent the complement of $I_j$. Let all terms in $\{t_1\vartheta_{11}\} \cup \{p_{i\alpha_i} \mid 2 \leq i \leq m_1$ and $1 \leq \alpha_i \leq M_i\} \cup \{q_{j\beta_j} \mid 2 \leq j \leq n$ and $1 \leq \beta_j \leq N_j\}$ be pairwise variable disjoint. Now suppose that the term $t_1\vartheta_{11}$ is non-linear w.r.t. $t_1$ and let $\vec{y}$ denote the vector of variables that occur more than once in the range of $\vartheta_{11}$. Finally suppose that there exist indices $\alpha_2, \ldots, \alpha_{m_1}$ with $1 \leq \alpha_i \leq M_i$ for all $i$, and $\beta_2, \ldots, \beta_n$ with $1 \leq \beta_j \leq N_j$ for all $j$, s.t. the following conditions hold:

1. $\nu = \mathit{mgu}(t_1\vartheta_{11}, p_{2\alpha_2}, \ldots, p_{m_1\alpha_{m_1}}, q_{2\beta_2}, \ldots, q_{n\beta_n})$ exists.

2. $Z\nu$ with $Z = X_{2\alpha_2} \wedge \ldots \wedge X_{m_1\alpha_{m_1}} \wedge Y_{2\beta_2} \wedge \ldots \wedge Y_{n\beta_n}$ contains no trivial disequation.

3. There exists a variable $y$ in $\vec{y}$ s.t. $y\nu$ is a non-ground term.

Then $\mathcal{I}$ has no finite explicit representation.

Proof. (Rough Sketch): Similarly to the proof of Theorem 3.2, we can use the terms $C_0(D)$ and $C_1(D)$ in order to construct ground terms $s$ and $s'$, s.t. $s'$ is inside $\mathcal{I}$ and $s$ is outside. Now suppose that $\mathcal{I}$ has a finite explicit representation $r_1 \vee \ldots \vee r_l$. Then, in particular, $s'$ is an instance of some $r_j$. Analogously to the proof of Theorem 3.2, we can derive a contradiction by showing that then also $s$ is an instance of $r_j$. For any details, see [14]. □

5 coNP-Completeness 

In [11], the coNP-completeness of the finite explicit representability problem of a single implicit generalization was proven. In this section we show the coNP-membership (and, hence, the coNP-completeness) also in case of disjunctions of implicit generalizations. In Theorem 4.2 we have provided a criterion for checking that a given disjunction of implicit generalizations has no finite explicit representation. It would be tempting to prove the coNP-membership via a non-deterministic algorithm based on this criterion, i.e.: Guess a non-linear term $t_i\vartheta_{ij}$ on the right-hand side of an implicit generalization $I_i$ and check that the conditions from Theorem 4.2 are fulfilled. Unfortunately, our criterion from Theorem 4.2 is sufficient but not necessary as the following example illustrates:

Example 5.1. Let $I_1 = f(f(x_1, x_2), f(x_3, x_4))/[f(f(y_1, y_1), f(y_2, a)) \vee f(f(y_1, y_1), f(a, y_2)) \vee f(y_1, f(f(y_2, y_3), y_4))]$ and $I_2 = f(x_1, f(x_2, f(x_3, x_4)))$ be implicit generalizations over $H$ with signature $\Sigma = \{a, f\}$ and let $\mathcal{I} = I_1 \vee I_2$. If we apply Theorem 4.1 to either $f(f(y_1, y_1), f(y_2, a))$ or $f(f(y_1, y_1), f(a, y_2))$ on the right-hand side of $I_1$, then this term may be deleted. However, after the deletion of one non-linear term, Theorem 4.1 only allows us to restrict the other term to the non-linear term $f(f(y_1, y_1), f(a, a))$. But then it is actually possible to apply Theorem 4.2, thus detecting that $\mathcal{I}$ has no finite explicit representation.

Rather than looking for a single term on the right-hand side of some implicit generalization $I_i$ which fulfills the criterion from Theorem 4.2, we have to find a subset of the terms on the right-hand side of $I_i$ for which the conditions from Theorem 4.2 hold simultaneously. We thus get the following criterion, which will then be used in Theorem 5.2 to prove the main result of this work.

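Theorem 5.1. Let $\mathcal{I} = I_1 \vee \ldots \vee I_n$ with $I_i = t_i/t_i\vartheta_{i1} \vee \ldots \vee t_i\vartheta_{im_i}$ be a disjunction of implicit generalizations, s.t. $P_{ij} = \{[p_{(ij),1} : X_{(ij),1}], \ldots, [p_{(ij),M_{ij}} : X_{(ij),M_{ij}}]\}$ represents the complement of $t_i\vartheta_{ij}$ w.r.t. $t_i$ and $Q_k = \{[q_{k1} : Y_{k1}], \ldots, [q_{kN_k} : Y_{kN_k}]\}$ represents the complement of $I_k$. Moreover, we assume that all terms in $[\bigcup_{i=1}^{n} \bigcup_{j=1}^{m_i} \{t_i\vartheta_{ij}\}] \cup [\bigcup_{i=1}^{n} \bigcup_{j=1}^{m_i} \bigcup_{\alpha=1}^{M_{ij}} \{p_{(ij),\alpha}\}] \cup [\bigcup_{k=1}^{n} \bigcup_{\beta=1}^{N_k} \{q_{k\beta}\}]$ are pairwise variable disjoint. Then $\mathcal{I}$ has no finite explicit representation, iff:

1. $\exists i \in \{1, \ldots, n\}$ and $\exists K \subseteq \{1, \ldots, m_i\}$ with $K \neq \emptyset$

2. For every $j \in (\{1, \ldots, m_i\} - K)$, there exists an index $\alpha_j \in \{1, \ldots, M_{ij}\}$

3. For every $j \in \{1, \ldots, i-1, i+1, \ldots, n\}$, there exists an index $\beta_j \in \{1, \ldots, N_j\}$

s.t. the following conditions hold for all $k \in K$:

1. The terms in $\{t_i\vartheta_{ik}\} \cup \{p_{(ij),\alpha_j} \mid j \in (\{1, \ldots, m_i\} - K)\} \cup \{q_{j\beta_j} \mid j \in \{1, \ldots, i-1, i+1, \ldots, n\}\}$ are unifiable, with mgu $\nu_k$ say.

2. $Z\nu_k$ with $Z = [\bigwedge_{j \in (\{1, \ldots, m_i\} - K)} X_{(ij),\alpha_j}] \wedge [\bigwedge_{j \in \{1, \ldots, i-1, i+1, \ldots, n\}} Y_{j\beta_j}]$ contains no trivial disequation.

3. There exists a multiply occurring variable $y_k$ in the range of $\vartheta_{ik}$, s.t. $y_k\nu_k$ is a non-ground term.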

Proof. (Sketch): For the "if"-direction, we may assume w.l.o.g. that $i = 1$ and $K = \{1, \ldots, \kappa\}$ holds for some $\kappa \in \{1, \ldots, m_1\}$. Furthermore, suppose that $\mathcal{I}$ has a finite explicit representation $r_1 \vee \ldots \vee r_l$. Similarly to the proof of Theorem 3.2, we can construct terms $s_1$ and $s_1'$, s.t. these terms are instances of $t_1$ but they are neither contained in $I_2 \vee \ldots \vee I_n$ nor in $t_1\vartheta_{1(\kappa+1)} \vee \ldots \vee t_1\vartheta_{1m_1}$. Moreover, $s_1$ is an instance of $t_1\vartheta_{11}$, while $s_1'$ is not. If $s_1'$ is not an instance of any term $t_1\vartheta_{1k}$ with $1 \leq k \leq \kappa$, then $s_1'$ is in fact an instance of $I_1$ and therefore of $\mathcal{I}$, while $s_1$ is not. Of course, there is no guarantee that $s_1'$ is not an instance of the terms $t_1\vartheta_{12}, \ldots, t_1\vartheta_{1\kappa}$. But then, similarly to the proof of Proposition 4.6 in [9], we can iteratively construct terms $s_k$ and $s_k'$ for $k \in \{1, \ldots, \kappa\}$, s.t. $s_k'$ is not an instance of the terms $t_1\vartheta_{11}, \ldots, t_1\vartheta_{1k}$, while $s_k$ is an instance of $t_1\vartheta_{1k}$. Hence, eventually we shall indeed get terms $s_\kappa$ and $s_\kappa'$, s.t. $s_\kappa'$ is an instance of $I_1$ and therefore of $\mathcal{I}$, while $s_\kappa$ is not. In particular, $s_\kappa'$ is an instance of some $r_\gamma$. But then, analogously to the proof of Theorem 3.2, we can derive a contradiction by showing that also $s_\kappa$ is an instance of $r_\gamma$. The details are worked out in [14].

In order to prove also the “only if”-direction, we provide an algorithm which,
for every $i \in \{1, \ldots, n\}$, inspects all possible triples $(K, A, B)$ with the following
properties: $K \subseteq \{1, \ldots, m_i\}$ with $K \neq \emptyset$, $A$ contains an index $\alpha_j \in \{1, \ldots, M_{ij}\}$
for every $j \in (\{1, \ldots, m_i\} - K)$ and $B$ contains an index $\beta_j \in \{1, \ldots, N_j\}$ for
every $j \in \{1, \ldots, i-1, i+1, \ldots, n\}$. Moreover, this algorithm computes sets $\mathcal{R}_i$
of instances of the terms $t_i\vartheta_{ij}$, s.t. all terms $r \in \mathcal{R}_i$ are linear w.r.t. $t_i$.

Let $i \in \{1, \ldots, n\}$. Moreover, suppose that there exists $\kappa \in \{1, \ldots, m_i\}$, s.t. the
terms $t_i\vartheta_{ij}$ on the right-hand side of $I_i$ are arranged in such a way that the terms
$t_i\vartheta_{i1}, \ldots, t_i\vartheta_{i\kappa}$ are non-linear instances of $t_i$ and the terms $t_i\vartheta_{i(\kappa+1)}, \ldots, t_i\vartheta_{im_i}$
are linear w.r.t. $t_i$. Then we start off with $(\mathcal{T}_i, \mathcal{R}_i)$, where the set $\mathcal{R}_i$ of linear
instances of $t_i$ consists of the terms $\{t_i\vartheta_{i(\kappa+1)}, \ldots, t_i\vartheta_{im_i}\}$ and $\mathcal{T}_i$ denotes the
set of all possible triples $(K, A, B)$ with $K = \{1, \ldots, \kappa\}$, $A = \{\alpha_{\kappa+1}, \ldots, \alpha_{m_i}\}$
and $B = \{\beta_1, \ldots, \beta_{i-1}, \beta_{i+1}, \ldots, \beta_n\}$.

end condition: If we encounter a triple $(K, A, B)$ in $\mathcal{T}_i$, s.t. for all $k \in K$, the
above three conditions of this theorem hold or if there is no triple at all left in
$\mathcal{T}_i$, then we may stop.

shrinking K: Suppose that there is a triple $(K, A, B)$ in $\mathcal{T}_i$ with $K \neq \emptyset$ and that
there is at least one element $k \in K$, s.t. one of the conditions of this theorem
does not hold for $(K, A, B)$. Let $k$ denote the minimum in $K$, for which at least
one condition of the theorem is violated. Then we delete $(K, A, B)$ from $\mathcal{T}_i$ and
add to $\mathcal{T}_i$ all triples of the form $(K', A', B)$ with $K' = K - \{k\}$ and $A' = A \cup \{\alpha_k\}$
for all possible values of $\alpha_k \in \{1, \ldots, M_{ik}\}$. Furthermore, if $\nu_k$ exists and $Z\nu_k$
contains no trivial disequation (i.e.: the first two conditions of this theorem hold
for $k$, while the third one does not), then we add $t_i\vartheta_{ik}\mu_k$ to $\mathcal{R}_i$, where $\mu_k$ is the
restriction of $\nu_k$ to the variables occurring in the range of $\vartheta_{ik}$.

base case: All triples $(K, A, B)$ in $\mathcal{T}_i$ with $K = \emptyset$ may be deleted.
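
The three steps above (end condition, shrinking K, base case) can be sketched as a worklist loop. This is only an illustration under stated assumptions: the function name `search_triples`, the representation of triples, and the `conditions` oracle (which would have to perform the unification and disequation checks of Theorem 5.1) are hypothetical and not part of the paper.

```python
def search_triples(initial, M, conditions):
    """Worklist sketch of the triple search for one fixed index i.

    initial    -- iterable of triples (K, A, B): K a frozenset of indices,
                  A a dict {j: alpha_j}, B any fixed choice of the beta_j
    M          -- dict {k: M_ik}, number of complement terms for index k
    conditions -- hypothetical oracle: conditions(k, K, A, B) returns three
                  booleans, one per condition of Theorem 5.1
    Returns (witness, added): a triple on which all conditions hold for every
    k in K (or None), plus records standing in for the terms added to R_i.
    """
    work = list(initial)
    added = []
    while work:
        K, A, B = work.pop()
        if not K:                       # base case: empty K, drop the triple
            continue
        violated = [k for k in sorted(K) if not all(conditions(k, K, A, B))]
        if not violated:                # end condition: witness found
            return (K, A, B), added
        k = violated[0]                 # minimal index violating a condition
        c1, c2, _ = conditions(k, K, A, B)
        if c1 and c2:                   # only the third condition fails
            added.append((k, dict(A), B))
        for alpha in range(1, M[k] + 1):    # shrinking K: K' = K - {k}
            work.append((K - {k}, {**A, k: alpha}, B))
    return None, added
```

Every replacement triple has a strictly smaller `K`, which mirrors the termination argument given in the text.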

The termination of this algorithm is clear since, in every step, a triple $(K, A, B)$
in $\mathcal{T}_i$ is either deleted or replaced by finitely many triples of the form $(K', A', B)$,
where $K'$ is a proper subset of $K$. Now suppose that this algorithm is applied to
every $i \in \{1, \ldots, n\}$. In Proposition 5.1 below, we show that either this algorithm
eventually detects an index $i \in \{1, \ldots, n\}$ together with a triple $(K, A, B)$, s.t. for
all $k \in K$, the three conditions of this theorem hold or, for every $i \in \{1, \ldots, n\}$,
the terms $t_i\vartheta_{i1} \vee \ldots \vee t_i\vartheta_{im_i}$ on the right-hand side of $I_i$ may be replaced by the
terms in $\mathcal{R}_i$. Note that all the terms in $\mathcal{R}_i$ are linear instances of $t_i$, since our
algorithm only adds those terms $t_i\vartheta_{ik}\mu_k$ to $\mathcal{R}_i$, for which the third condition of
this theorem is violated. However, the replacement of the terms on the right-hand
side of every $I_i$ by linear instances of $t_i$ is impossible by assumption. But
then we must eventually encounter an index $i \in \{1, \ldots, n\}$ together with a triple
$(K, A, B)$ with the desired properties. □
Proposition 5.1. The algorithm given in the proof of Theorem 5.1 is correct,
i.e.: Suppose that for every index $i \in \{1, \ldots, n\}$, this algorithm stops without
finding a triple $(K, A, B)$, s.t. for all $k \in K$ the three conditions of Theorem 5.1
hold. Then, for every $i \in \{1, \ldots, n\}$, the terms $t_i\vartheta_{i1} \vee \ldots \vee t_i\vartheta_{im_i}$ on the right-hand
side of $I_i$ may in fact be replaced by the terms in $\mathcal{R}_i$, i.e.: Let $\mathcal{I} = I_1 \vee \ldots \vee I_n$
and $\mathcal{I}' = I'_1 \vee \ldots \vee I'_n$ with $I'_i = t_i / \bigvee_{r \in \mathcal{R}_i} r$. Then $\mathcal{I}$ and $\mathcal{I}'$ are equivalent.

Proof. By the construction of $\mathcal{R}_i$, every term $r \in \mathcal{R}_i$ is an instance of some term
$t_i\vartheta_{ij}$. Hence, $I_i \subseteq I'_i$ and, therefore, also $\mathcal{I} \subseteq \mathcal{I}'$ clearly holds. In order to prove
also the opposite subset relation, we choose an arbitrary $s \in \mathcal{I}'$ and show that
$s \in \mathcal{I}$ also holds. W.l.o.g. we may assume that $s$ is an instance of $I'_1$. Then, in
particular, $s$ is an instance of $t_1$. If $s$ is an instance of $I_2 \vee \ldots \vee I_n$ then, of course,
$s \in \mathcal{I}$ holds and we are done. So suppose that $s$ is in the complement of $I_2, \ldots, I_n$,
i.e.: There exist indices $\beta_2, \ldots, \beta_n$, s.t. $s \in \bigwedge_{j=2}^{n} [q_{j\beta_j} : Y_{j\beta_j}]$. Note that by the
construction of $\mathcal{R}_1$, all linear instances $t_1\vartheta_{1(\kappa+1)}, \ldots, t_1\vartheta_{1m_1}$ of $t_1$ from the right-hand
side of $I_1$ also appear on the right-hand side of $I'_1$. By the condition $s \in I'_1$
we thus know that $s$ is in the complement of the terms $t_1\vartheta_{1(\kappa+1)}, \ldots, t_1\vartheta_{1m_1}$,
i.e.: There exist indices $\alpha_{\kappa+1}, \ldots, \alpha_{m_1}$, s.t. $s \in \bigwedge_{j=\kappa+1}^{m_1} [p_{(1j),\alpha_j} : X_{(1j),\alpha_j}]$. Now
consider the triple $(K, A, B)$ from the initial set $\mathcal{T}_1$, where $A$ and $B$ consist
exactly of those indices $\alpha_{\kappa+1}, \ldots, \alpha_{m_1}$ and $\beta_2, \ldots, \beta_n$, respectively, s.t. $s$ is
an instance of $\bigwedge_{j=\kappa+1}^{m_1} [p_{(1j),\alpha_j} : X_{(1j),\alpha_j}] \wedge \bigwedge_{j=2}^{n} [q_{j\beta_j} : Y_{j\beta_j}]$. If $\kappa = 0$, then
$K = \emptyset$ holds and all terms on the right-hand side of $I_1$ are linear instances
of $t_1$. But then, $s$ is indeed an instance of $I_1$ since we only consider the case
where $s$ is an instance of $t_1$ and $s$ is in the complement of $t_1\vartheta_{11}, \ldots, t_1\vartheta_{1m_1}$.

So let $K \neq \emptyset$ and let $k$ denote the minimum in $K$, for which at least one
condition of Theorem 5.1 is violated. We claim that then $s$ is not an instance of
$t_1\vartheta_{1k}$. For suppose on the contrary that $s$ is an instance of $t_1\vartheta_{1k}$. Then $s$ is also
contained in $t_1\vartheta_{1k} \wedge \bigwedge_{j=\kappa+1}^{m_1} [p_{(1j),\alpha_j} : X_{(1j),\alpha_j}] \wedge \bigwedge_{j=2}^{n} [q_{j\beta_j} : Y_{j\beta_j}]$. Hence, $s$ is an
instance of $t_1\vartheta_{1k}\mu_k$, where $\mu_k$ is the restriction of the unifier $\nu_k = mgu(\{t_1\vartheta_{1k}\} \cup
\{p_{(1j),\alpha_j} \mid \kappa+1 \le j \le m_1\} \cup \{q_{j\beta_j} \mid 2 \le j \le n\})$ to the variables occurring in the
range of $\vartheta_{1k}$. However, $t_1\vartheta_{1k}\mu_k$ is added to $\mathcal{R}_1$ by our algorithm and, therefore,
$t_1\vartheta_{1k}\mu_k$ occurs on the right-hand side of $I'_1$. But then $s$ is not an instance of $I'_1$,
which is a contradiction.

Thus $s$ is in the complement of $t_1\vartheta_{1k}$ and, therefore, there exists an index
$\alpha_k \in \{1, \ldots, M_{1k}\}$, s.t. $s \in [p_{(1k),\alpha_k} : X_{(1k),\alpha_k}]$. Note that our algorithm actually
adds the triple $(K', A', B)$ to $\mathcal{T}_1$, where $K' = K - \{k\}$ and $A' = A \cup \{\alpha_k\}$
holds. By assumption, our algorithm never finds a triple for which all the three
conditions from Theorem 5.1 hold. Hence, eventually we shall have to select this
new triple $(K', A', B)$. But then we can repeat the same argument as above also
for the triple $(K', A', B)$: Let $k'$ denote the minimum in $K'$, for which at least
one condition of Theorem 5.1 is violated. We claim that then $s$ is not an instance
of $t_1\vartheta_{1k'}$. For suppose on the contrary that $s$ is an instance of $t_1\vartheta_{1k'}$. Then $s$ is
also in $t_1\vartheta_{1k'} \wedge [p_{(1k),\alpha_k} : X_{(1k),\alpha_k}] \wedge \bigwedge_{j=\kappa+1}^{m_1} [p_{(1j),\alpha_j} : X_{(1j),\alpha_j}] \wedge \bigwedge_{j=2}^{n} [q_{j\beta_j} : Y_{j\beta_j}]$.
Hence, $s$ is an instance of $t_1\vartheta_{1k'}\mu_{k'}$, where $\mu_{k'}$ is defined as the restriction of
$\nu_{k'} = mgu(\{t_1\vartheta_{1k'}\} \cup \{p_{(1k),\alpha_k}\} \cup \{p_{(1j),\alpha_j} \mid \kappa+1 \le j \le m_1\} \cup \{q_{j\beta_j} \mid 2 \le j \le n\})$
to the variables occurring in the range of $\vartheta_{1k'}$. This is again impossible, since
$t_1\vartheta_{1k'}\mu_{k'}$ is added to $\mathcal{R}_1$ by our algorithm and, therefore, $t_1\vartheta_{1k'}\mu_{k'}$ occurs on the
right-hand side of $I'_1$. But then $s$ is not an instance of $I'_1$, which is a contradiction.

Note that by iterating this argument $\kappa$ times, we can show that $s$ is not
an instance of any term $t_1\vartheta_{1j}$ with $j \in \{1, \ldots, \kappa\}$. Moreover, recall that we
consider the case where $s$ is an instance of $t_1$ and $s$ is in the complement of
$t_1\vartheta_{1(\kappa+1)}, \ldots, t_1\vartheta_{1m_1}$. Hence, $s$ is an instance of $I_1$ and, therefore, also of $\mathcal{I}$. □

Theorem 5.2. (coNP-Completeness) The finite explicit representability problem
for disjunctions of implicit generalizations is coNP-complete.

Proof. By the coNP-hardness result from [12], we only have to prove the
coNP-membership. To this end, we consider the following algorithm for checking that
$\mathcal{I} = I_1 \vee \ldots \vee I_n$ with $I_i = t_i / t_i\vartheta_{i1} \vee \ldots \vee t_i\vartheta_{im_i}$ has no finite explicit representation:

1. Guess $i$, $K$, $A$ and $B$.

2. Check for all $k \in K$, that the conditions from Theorem 5.1 hold.

This algorithm clearly works in non-deterministically polynomial time, provided 
that an efficient unification algorithm is used (cf. [13]). Moreover, the correctness 
of this algorithm follows immediately from Theorem 5.1. □ 
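
To make the size of the search space concrete, the following sketch enumerates every candidate certificate $(i, K, A, B)$ that the guessing step ranges over. It is a hedged illustration, not the paper's procedure: indices are 0-based, the names `certificates`, `has_no_finite_explicit_representation` and the `check` callback (standing in for the verification of the Theorem 5.1 conditions) are assumptions made here.

```python
from itertools import chain, combinations, product

def certificates(m, M, N):
    """Yield every candidate (i, K, A, B), with 0-based indices.

    m[i]    -- number of right-hand-side terms of I_i
    M[i][j] -- size of the complement representation P_ij
    N[j]    -- size of the complement representation Q_j
    """
    n = len(m)
    for i in range(n):
        idx = range(m[i])
        # all non-empty subsets K of the right-hand-side indices of I_i
        subsets = chain.from_iterable(
            combinations(idx, r) for r in range(1, m[i] + 1))
        for K in subsets:
            rest = [j for j in idx if j not in K]
            others = [j for j in range(n) if j != i]
            for alphas in product(*(range(M[i][j]) for j in rest)):
                for betas in product(*(range(N[j]) for j in others)):
                    yield i, K, dict(zip(rest, alphas)), dict(zip(others, betas))

def has_no_finite_explicit_representation(m, M, N, check):
    """Deterministic version: try every certificate with the oracle `check`."""
    return any(check(i, K, A, B) for i, K, A, B in certificates(m, M, N))
```

The number of certificates grows exponentially with the number of right-hand-side terms, which is why the nondeterministic guess matters for the complexity bound.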

By the coNP-hardness of the finite explicit representability problem, we can
hardly expect to find a decision procedure with a significantly better worst case
complexity than a deterministic version of the NP-algorithm from Theorem 5.2
above, i.e.: Test for all possible values of $i$, $K$, $A = \{\alpha_j \mid j \in (\{1, \ldots, m_i\} - K)\}$
and $B = \{\beta_1, \ldots, \beta_{i-1}, \beta_{i+1}, \ldots, \beta_n\}$, whether the conditions from Theorem 5.1
hold. However, for practical purposes, one will clearly prefer the algorithm from
the proof of Theorem 5.1, which has a major advantage, namely: If the conditions
from Theorem 5.1 do not hold for any value of $i$, $K$, $A$ and $B$, then this algorithm
actually computes for every $i \in \{1, \ldots, n\}$ a set $\mathcal{R}_i$ of linear instances of the term
$t_i$, s.t. the terms on the right-hand side of $I_i$ may be replaced by the terms in
$\mathcal{R}_i$. By the considerations on the complement of linear terms from Sect. 2, it is
then easy to convert the resulting implicit generalizations $I'_i$ into explicit ones.



6 Related Works 

A decision procedure for the finite explicit representability problem of a single
implicit generalization $I = t / t_1 \vee \ldots \vee t_m$ was first presented in [9]. Its basic idea
is a division into subproblems via the complement of a linear instance $t_i$ of $t$ on
the right-hand side of $I$, i.e.: Let $P = \{p_1, \ldots, p_M\}$ represent the complement of
$t_i$ w.r.t. $t$ and suppose that all terms $t_i$, $p_j$ and $t$ are pairwise variable disjoint.
Then $I$ is equivalent to the disjunction $\bigvee_{j=1}^{M} I_j$ with $I_j = p_j / mgi(t_1, p_j) \vee \ldots \vee
mgi(t_{i-1}, p_j) \vee mgi(t_{i+1}, p_j) \vee \ldots \vee mgi(t_m, p_j)$. The subproblems $I_j$ thus produced
have strictly fewer terms on the right-hand side. Hence, by applying this splitting
step recursively to each subproblem, we either manage to remove all terms from
the right-hand side of all subproblems or we eventually encounter a subproblem
with only non-linear instances on the right-hand side. For the latter case, it has
been shown in [9], that $I$ has no finite explicit representation.
In Sect. 3 we have provided a new algorithm for deciding the finite explicit
representability problem of a single implicit generalization. Note that this
algorithm starts from the “opposite direction”, i.e.: Rather than removing the linear
terms from the right-hand side until finally only non-linear ones are left, our
algorithm tries to replace the non-linear terms by linear ones until it finally
detects a non-linear term which cannot be replaced by linear ones. If an implicit
generalization has only a few non-linear terms on the right-hand side, then our
algorithm from Sect. 3 may possibly be advantageous. However, in general, the
algorithm from [9] will by far outperform our algorithm from Sect. 3 for the
following reason: Suppose that there are several non-linear terms on the right-hand
side of an implicit generalization $I$ and that we may replace one such non-linear
term $t_1$ by linear ones $u_1, \ldots, u_M$ via Theorem 3.1. This replacement step, in
general, yields exponentially many terms $u_1, \ldots, u_M$ (w.r.t. the number $m$ of
terms and also w.r.t. the size of these terms). Suppose that we next apply
Theorem 3.1 to another non-linear term $t_2$ on the right-hand side of $I$ and that we
may replace $t_2$ by the linear terms $v_1, \ldots, v_N$. Then $N$ is actually exponential
w.r.t. $M$. But then $N$ is doubly exponential w.r.t. the size of the original implicit
generalization. Of course, having to restrict yet another non-linear term to
the complement of the terms $v_1, \ldots, v_N$ makes the situation even worse, etc.

In [15], the algorithm from [9] is extended to disjunctions of implicit
generalizations. This algorithm consists of two rewrite rules: One is exactly the
splitting rule from [9]. The other one basically allows us to restrict the instances
of a non-linear term $t_{ij}$ on the right-hand side of an implicit generalization $I_i$
to the complement of another implicit generalization $I_k = t_k / t_{k1} \vee \ldots \vee t_{km_k}$,
provided that $mgi(t_{ij}, t_k)$ is a linear instance of $t_{ij}$. Hence, when a disjunction of
implicit generalizations contains several disjuncts with non-linear terms on the
right-hand side, then the algorithm of [15] also suffers from the above mentioned
exponential blow-up whenever a term is restricted to the complement of terms
which are themselves the result of such a restriction step. Of course, our
algorithm from Sect. 4 has this problem as well. However, the algorithm from Sect. 5
is clearly better than this, i.e.: The terms that we add to the sets $\mathcal{R}_i$ in the proof
of Theorem 5.1 result from restricting a non-linear term $t_{ij}$ on the right-hand
side of $I_i$ to the complement of some other terms $t_{ij'}$ on the right-hand side of
$I_i$ and to the complement of the other implicit generalizations $I_{i'}$. However, we
never restrict a term w.r.t. terms which are themselves the result of a previous
restriction step. Recall that in [12] an upper bound on the time and space
complexity of the algorithm from [9] is given, which is basically exponential w.r.t.
the size of an input implicit generalization (cf. [12], Theorem 5.5). In fact, our
algorithm from Sect. 5 also has an upper bound with a single exponentiality,
no matter whether we apply this algorithm to a single implicit generalization
or to a disjunction of implicit generalizations: In order to see this, recall from
Theorem 2.1, that the number of constrained terms in the complement of a term
$t$ is quadratically bounded by the size of this term. Hence, also the number $M_{ij}$
of constrained terms in the representation of the complement of $t_i\vartheta_{ij}$ w.r.t. $t_i$ as
well as the number $N_k$ of constrained terms in the complement representation of
$I_k$ is quadratically bounded by the size of $t_i\vartheta_{ij}$ and $I_k$, respectively. But then the
number of possible values of $i$, $K$, $A$ and $B$, that our algorithm from Sect. 5 has
to inspect, is clearly exponentially bounded in the size of the original disjunction
$\mathcal{I}$ of implicit generalizations.

In [15], the results on implicit generalizations of terms are extended to tuples
of terms. Moreover, recall that the term tuples contained in a disjunction $\mathcal{I} =
I_1 \vee \ldots \vee I_n$ of implicit generalizations with $I_j = t_j / t_{j1} \vee \ldots \vee t_{jm_j}$ can be
represented by an equational formula of the form

$$\bigvee_{j=1}^{n} (\exists \vec{x}_j)(\forall \vec{y}_j)\,(\vec{z} = t_j \wedge t_{j1} \neq t_j \wedge \ldots \wedge t_{jm_j} \neq t_j)$$

with free variables in $\vec{z}$, where $\vec{x}_j$ denotes the variables
occurring in $t_j$ and $\vec{y}_j$ denotes the variables occurring in $t_{j1} \vee \ldots \vee t_{jm_j}$. It
is shown in [15], that only a slight extension of the transformation given in 
[10] is required so as to transform any equational formula into an equivalent 
one of this form. But then, an algorithm for the finite explicit representability 
problem of disjunctions of implicit generalizations immediately yields a decision 
procedure for the negation elimination problem of arbitrary equational formulae. 
In [3], a different decision procedure for the negation elimination problem is 
given by appropriately extending the reduction system from [2] . Even though this 
approach differs significantly from the method of [15], this algorithm also requires 
that certain terms have to be transformed w.r.t. terms that are themselves the 
result of a previous transformation step. Hence, analogously to the algorithm of 
[15], it does not seem as though there exists a singly exponential upper bound on 
the complexity of this algorithm even if it is only applied to equational formulae 
which correspond to a disjunction of implicit generalizations. 

Several publications deal with the complexity of the emptiness and the finite 
explicit representability problem. In [6], [7] and [5], the coNP-completeness of 
the emptiness problem was proven. The coNP-hardness of the emptiness prob- 
lem was also shown in [12], where the coNP-hardness of the finite explicit rep- 
resentability problem was then proven by reducing the emptiness problem to 
it. A coNP-membership proof of the finite explicit representability problem of a 
single implicit generalization was given in [11]. The coNP-membership in case of 
disjunctions of implicit generalizations has been an open question so far. 

7 Conclusion 

In this paper, we have revisited the finite explicit representability problem of im- 
plicit generalizations. We have provided an alternative algorithm for this decision 
problem which allowed us to prove the coNP-completeness in case of disjunctions 
of implicit generalizations. The most important aim for future research in this 
area is clearly the search for a more efficient algorithm. By our considerations 
from Sect. 6, our algorithm from Sect. 5 has a similar worst case complexity 
to the algorithm for single implicit generalizations from [9]. In contrast to the 
algorithms from [15] and [3], this upper bound on the complexity holds for our 
algorithm also in case of disjunctions of implicit generalizations. Hence, our al- 
gorithm can be seen as a step towards a more efficient algorithm for this decision 
problem, to which further steps should be added by future research. 
On the Word Problem for Combinators

Abstract. In 1936 Alonzo Church observed that the “word problem”
for combinators is undecidable. He used his student Kleene’s representa- 
tion of partial recursive functions as lambda terms. This illustrates very 
well the point that “word problems” are good problems in the sense that 
a solution either way - decidable or undecidable - can give useful infor- 
mation. In particular, this undecidability proof shows us how to program 
arbitrary partial recursive functions as combinators. 

I never thought that this result was the end of the story for combinators. 
In particular, it leaves open the possibility that the unsolvable problem 
can be approximated by solvable ones. It also says nothing about word 
problems for interesting fragments i.e., sets of combinators not combina- 
torially complete. 

Perhaps the most famous subproblem is the problem for S terms. Re- 
cently, Waldmann has made significant progress on this problem. Prior, 
we solved the word problem for the Lark, a relative of S. Similar solutions 
can be given for the Owl (S*) and Turing’s bird U. Familiar decidable 
fragments include linear combinators and various sorts of typed combi- 
nators. Here we would like to consider several fragments of much greater 
scope. We shall present several theorems and an open problem. 



1 Introduction 

In 1936 Alonzo Church [3] observed that the “word problem” for combinators 
is undecidable. He used his student Kleene’s representation of partial recursive 
functions as lambda terms. This illustrates very well the point that “word prob- 
lems” are good problems in the sense that a solution either way - decidable or 
undecidable - can give useful information. In particular, this undecidability proof 
shows us how to program arbitrary partial recursive functions as combinators. 

I never thought that this result was the end of the story for combinators. 
In particular, it leaves open the possibility that the unsolvable problem can be 
approximated by solvable ones. It also says nothing about word problems for 
interesting fragments i.e., sets of combinators not combinatorially complete. 

Perhaps the most famous subproblem is the problem for S terms. Recently, 
Waldmann has made significant progress on this problem [12]. In [8] we solved 
the word problem for the Lark, a relative of S. Similar solutions can be given 
for the Owl (S*) and Turing’s bird U [7]. Familiar decidable fragments include
linear combinators and various sorts of typed combinators. Here we would like to 
consider several fragments of much greater scope. We shall present several the- 
orems and an open problem. We begin with a series of decidable word problems 
which approximate the undecidable general case. 

Proper combinators P are given by reduction rules

P x(1) ... x(p) → X

where X is an applicative combination of the indeterminates x(1) ... x(p). P
is said to be a compositor if X cannot be written without parentheses using
the convention of association to the left. Equivalently, in the terminology of
Craig’s conditions [9], P has compositive effect. Otherwise, P is said to be a
co-compositor. Examples of co-compositors are

Cxyz → xzy (The Cardinal)

Kxy → x (The Kestrel)

Ix → x (The Identity)

Wxy → xyy (The Warbler)

and the typical compositor is the Bluebird

Bxyz → x(yz).
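
The reduction rules listed above can be tried out with a small head-reduction sketch. The term representation (atoms as strings, application as a pair) and the helper names `app`, `spine`, `head_step` are illustrative assumptions, not notation from the paper.

```python
def app(head, *args):
    """Build the left-associated application head a1 a2 ... as nested pairs."""
    for a in args:
        head = (head, a)
    return head

# Each rule: (number of arguments, function building the right-hand side).
RULES = {
    "C": (3, lambda x, y, z: app(x, z, y)),   # Cxyz -> xzy   (Cardinal)
    "K": (2, lambda x, y: x),                 # Kxy  -> x     (Kestrel)
    "I": (1, lambda x: x),                    # Ix   -> x     (Identity)
    "W": (2, lambda x, y: app(x, y, y)),      # Wxy  -> xyy   (Warbler)
    "B": (3, lambda x, y, z: (x, (y, z))),    # Bxyz -> x(yz) (Bluebird)
}

def spine(t):
    """Split a term into its head atom and the list of its arguments."""
    args = []
    while isinstance(t, tuple):
        t, a = t
        args.append(a)
    return t, args[::-1]

def head_step(t):
    """Contract the head redex of t, or return None if there is none."""
    head, args = spine(t)
    rule = RULES.get(head)
    if rule is None or len(args) < rule[0]:
        return None
    arity, rhs = rule
    return app(rhs(*args[:arity]), *args[arity:])
```

Only the Bluebird produces a right-hand side with a nested pair in argument position, which is exactly what makes it a compositor.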



Suppose that we fix a finite set of compositors Q with rules

Q x(1) ... x(q) → X.

Consider the set of all applicative combinations of arbitrary co-compositors and
these finitely many compositors, and select finitely many closed instances

Q M(1) ... M(q) → [M(1)/x(1), ..., M(q)/x(q)] X

of the reduction rules for the compositors in this set. These finitely many
instances together with the reduction rules for all co-compositors induce a
reducibility relation ↠ and a congruence relation ↔.

Theorem: The relation ↔ is decidable. Indeed, when finitely many co-compositors
are selected in advance the problem is solvable in polynomial time.



Next we consider a tantalizing open problem and several closed variants.

In [9] it is proved that any basis of proper combinators must contain one of order
> 2. This suggests that the word problem for all proper combinators of order 
< 3 might be decidable. 



Problem: Determine whether the word problem for all proper combinators of 
order < 3 is decidable. 
Slightly less is decidable. P is said to be hereditarily of order one (HOO) if P
has reduction rule

Px → X

where X is a normal applicative combination of x and previously defined
members of HOO. Examples are

Ix → x (The Identity)

Mx → xx (The Mockingbird)

K*x → I (The Identity once removed)

C**x → xI (The Cardinal twice removed)

Theorem: The word problem for HOO is decidable. Indeed, it is log-space 
complete for polynomial-time. 

Slightly more is undecidable. P is said to be hereditarily of order two (HOT) if
P has reduction rule

Pxy → X

where X is a normal applicative combination of x, y, and previously defined
members of HOT.

Examples:

Lxy → x(yy) (The Lark)

Uxy → y(xxy) (Turing’s bird)

Oxy → y(xy) (The Owl)

Theorem: The word problem for HOT is undecidable. Indeed every partial
recursive function, under an appropriate encoding, can be represented in HOT.
Below we omit most proofs except for one or two lines indicating the direction
of proof. Proofs will be presented in the final version of the paper.

2 Compositors and Co-compositors 

Again, proper combinators P are given by reduction rules

P x(1) ... x(p) → X (1)

where X is an applicative combination of the indeterminates x(1) ... x(p).
Evidently, P is a co-compositor if and only if there is an integer r and a function
f : [1, r] → [1, p] such that

X = x(f(1)) ... x(f(r)).

Let COCO be the set of all co-compositors.
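
The characterization of co-compositors by a function f can be checked mechanically. The sketch below assumes a representation chosen here for illustration: indeterminates are the integers 1..p, application is a pair, and `flat_word` is a hypothetical helper name.

```python
def flat_word(rhs):
    """Return [f(1), ..., f(r)] if rhs is x(f(1)) ... x(f(r)) associated to
    the left (the co-compositor case), or None if rhs needs parentheses."""
    word = []
    while isinstance(rhs, tuple):
        rhs, arg = rhs
        if isinstance(arg, tuple):      # a compound argument: compositor
            return None
        word.append(arg)
    word.append(rhs)                    # the head indeterminate x(f(1))
    return word[::-1]
```

For the Cardinal, whose right-hand side is xzy, the call `flat_word(((1, 3), 2))` yields the word `[1, 3, 2]`, while the Bluebird's right-hand side x(yz), encoded as `(1, (2, 3))`, is rejected.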
Suppose that we fix a finite set CO of compositors Q with rules

Q x(1) ... x(q) → X. (2)

Let COMB be the set of all applicative combinations of members of CO ∪ COCO.
Select finitely many closed instances

Q M(1) ... M(q) → [M(1)/x(1), ..., M(q)/x(q)] X (3)

of the reduction rules for members of CO where M(1), ..., M(q) are members of
COMB. These finitely many instances together with the reduction rules for all
co-compositors induce a reducibility relation ↠ and a congruence relation
↔. More precisely, ↠ is the monotone, transitive, reflexive closure of all COMB
instances of (1) for P in COCO and the finitely many (3) for Q a member of
CO. ↔ is the congruence closure of ↠.

Now the reducibility relation ↠ need not be Church-Rosser since the
instances (3) may not be closed under reducibility. For this reason we shall extend
↠ to a Church-Rosser reducibility which generates the same ↔. We define
reductions →(n) by induction on n as follows.

Basis; n = 0. →(0) is → and ↠(0) is ↠.

Induction Step; n > 0. We define →(n) to be →(n − 1) together with all
instances (3) such that there exist N(1), ..., N(q) satisfying

(i) M(i) ↠(n − 1) N(i) for i = 1, ..., q

(ii) Q N(1) ... N(q) →(n − 1) [N(1)/x(1), ..., N(q)/x(q)] X.

↠(n) is defined to be the monotone, transitive, reflexive closure of →(n).



Finally, we define →+ to be the union of the →(n) and ↠+ the union of
the ↠(n). Clearly the congruence closure of ↠+ is just ↔ and ↠+
is Church-Rosser. Given a finite →+ reduction

M(1) →+ M(2) →+ ... M(m − 1) →+ M(m) (4)

where each step consists of contracting a single →+ redex we assign a triple of
integers with coordinates

(i) the least n such that (4) is an ↠(n) reduction

(ii) m = the length of (4)

(iii) |M(1)| = the length of M(1)

ordered lexicographically. We refer to the triple as the ordinal of (4). Evidently,
the ordinal of (4) is less than omega cubed.

We define the positive subterms of a term M as follows. If M = P M(1) ...
M(p) then the positive subterms of M consist of M and all the positive subterms
of M(1) and ... and M(p).

We have the following: 

Fact 1: If M ↠+ N then every positive subterm of N is a ↠+ reduct of a
positive subterm of M or







(*) a positive subterm of one side or the other of one of the closed instances (3) 
for Q in CO. 

Moreover, the ordinals of these reductions are not larger than the ordinal of
the reduction M ↠+ N.

Proof: By induction on the ordinal of the reduction. 

Now suppose that M is given and define a set of identities
AXIOM(M) as follows. AXIOM(M) contains all the identities

(i) Q M(1) ... M(q) = [M(1)/x(1), ..., M(q)/x(q)] X, for the selected closed
instances (3) for the members Q of CO.

AXIOM(M) also contains all the identities

(ii) P N(1) ... N(p) = N(f(1)) ... N(f(r)) for each member P of COCO which
appears in either M or one of the M(j) in (i) and for each N(1), ..., N(p)
which are positive subterms of either M or (*).

Fact 2: If M ↠+ N then AXIOM(M) ⊢ M = N

Proof: By induction on the ordinal of the reduction M ↠+ N.

We obtain the following:

Proposition 1: M ↔ N if and only if AXIOM(M) ∪ AXIOM(N) ⊢ M = N

Proof: Clearly AXIOM(M) ∪ AXIOM(N) ⊢ M = N ⇒ M ↔ N. Now
suppose that M ↔ N. By the Church-Rosser theorem there exists L such
that M ↠+ L and N ↠+ L. By Fact 2, AXIOM(M) ⊢ M = L and AXIOM(N)
⊢ N = L. Thus, AXIOM(M) ∪ AXIOM(N) ⊢ M = N. ■

Thus, we obtain the decidability proof by remarking that the problem of 
determining whether a finite list of closed identities implies a closed identity is 
decidable (the word problem for finitely presented algebras) [1]. 
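
The decision step invoked here, checking whether finitely many closed identities imply a closed identity, can be illustrated with a naive congruence closure over the subterms involved. This is a minimal sketch under assumptions made here (terms are strings or application pairs; `entails` and `subterms` are hypothetical names); it is the textbook quadratic fixpoint, not necessarily the procedure of [1].

```python
def subterms(t, acc):
    """Collect t and all of its subterms into the set acc."""
    acc.add(t)
    if isinstance(t, tuple):
        subterms(t[0], acc)
        subterms(t[1], acc)

def entails(identities, goal):
    """Decide whether the closed identities imply the closed identity goal."""
    terms = set()
    for l, r in list(identities) + [goal]:
        subterms(l, terms)
        subterms(r, terms)
    parent = {t: t for t in terms}      # union-find forest over subterms

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]   # path halving
            t = parent[t]
        return t

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for l, r in identities:
        union(l, r)
    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:                      # propagate congruence to a fixpoint
        changed = False
        for s in apps:
            for t in apps:
                if (find(s) != find(t)
                        and find(s[0]) == find(t[0])
                        and find(s[1]) == find(t[1])):
                    union(s, t)
                    changed = True
    return find(goal[0]) == find(goal[1])
```

Faster congruence closure algorithms exist; the quadratic loop is kept here only because it is short and easy to check.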



3 Combinators Hereditarily of Order One 

Members of HOO are atoms with associated reduction rules. These reduction
rules generate a notion of reducibility ↠ and a congruence ↔. HOO and →
are defined simultaneously by induction as follows.

If X is an applicative combination of x’s then H defined by the reduction
rule Hx → X belongs to HOO.

If X is a ↠ normal combination of x’s and previously defined members of
HOO then H defined by the reduction rule Hx → X belongs to HOO.
We use H, J, L to range over HOO combinators, and M, N for HOO
combinations below. We shall write λx. X for the member of HOO with reduction
rule ending in X.

→ is a regular left normal combinatory reduction system [5], so it satisfies
the Church-Rosser and Standardization theorems. Clearly, any normal HOO
combination belongs to HOO. If M is a HOO combination with no normal form
we write M = @. This makes sense since the corresponding lambda term is an
order zero unsolvable. More generally it is easy to see that ↔ coincides with
beta conversion of the corresponding lambda terms.

We define the notion of @ normal form (@nf) as follows. 

M is in @nf if M = a HOO combinator H or M = HJM(1) ... M(m) where
HJ = @ and each M(i) for i = 1, ..., m is in @nf.

It is easy to see that @nf’s always exist but they are hardly unique. The following 
relation is useful for computing @nf’s. is the monotone closure of 

HM:^ [^]X if M = @ 
r LiiHJ 

i[f]Xif HJ = @ 

for H = XxX. 

The relation is actually decidable; more about this below. A simple induction 
shows 

Fact 1: M M>^ H. 

We need the following notation. We write M = M[M(1), ..., M(m)] if M(1),
..., M(m) are disjoint occurrences of the corresponding HOO combinations in
M.

Lemma 1: If M = M[M(I), . . . , M(m)], for z = I, . . . , m we have M(z) 

H{i), and M ^ N then we can write N = N[N{1), . . .,N{m)] with , for 
i = I, . . . , m, N{i) J(z) and M[H{1 ), . . . , H{m)>^^ fV[J(I), . . . , J{m)\ 

Proof: By a case analysis of the redex position. 

From this we obtain the following 

Proposition 2: IF M — N and is @ normal then N. 

⊏ is the partial order on members of HOO defined by the cover relations

J ⊏ H if H = λx. J

λx. X(i) ⊏ H for i = 1, ..., n if H = λx. x X(1) ... X(n).

A sequence H_ = H(1) ... H(t) of HOO combinators is said to be admissible if
H_ is closed under ⊏ and H(i) ⊏ H(j) ⇒ i < j. Note that if H_ is admissible and
H(i)H(j) ↠ H then H is a member of H_.
We define the n×n matrix MATRIX(H_) with entries in {1, ..., n, @} by

MATRIX(H_)(i, j) = k if H(i)H(j) = H(k), and @ otherwise.

The procedure ( )@ is computed on H_ combinations as follows. = 

m 

(iJ(i)M(l) . ..M{m))@ = 

(H{j) M(2) . . . M{m))@ if H(i) = \x.H{j) 

or (M(l))@ = H{k) and 
H{i)H{k) ^H{j) 

H{i) H{j) (M(2))@ . . . {M{m))@ if (M(l))@ = H{j) and 

H{i) H{j) = @ 

A(M(2))@...(M(m))@ if M(l) = @ and 

H{i) = Ax. X with x in X. 

Although the output of ( )@ can be exponentially long in the input this is
only because of repeated subterms. The procedure will run in time polynomial in
the input and H_ if the output is coded as a system of assignment statements.
The relation ↦ is defined to be the monotone closure of

HJ ↦ ([J/x] X)@

where H = λx. X. ↦ is particularly useful in conversion between @ normal
forms. Observe here that the congruence relation generated by ↦ restricted to
admissible H_ can be presented as a finitely presented algebra and so, as above,
it is decidable by [1].

FACT 2: If M = @ then {MN)@ (N)@. 

FACT 3: If iJ = Ax. A and M = @ then {HM)@ ^^{[M/x] X) @. 

Lemma 2: If N then (M)@ {N)@. 

Proof: By induction on M. 

Proposition 3: If M and N are @ normal forms and M — N then 

M\ N. 

Corollary : If M and N are @ normal forms and M N then there exists a 
@ normal form R such that M !—»—!■ R ^^\N. 

We now suppose that MATRIX(H_) is given and we wish to compute
MATRIX(H_H(n + 1)). Toward this end we define a procedure $( ) which takes as an
input an H_H(n + 1) combination and depends on H_ and a parameter V, a
subset of {1, ..., n + 1} ⊗ {n + 1} (here we suppose that
MATRIX(H_) has been supplemented with values for pairs not in R).

Input: M

If M = H(i) then return i else
If M = H(i)M(1) ... M(m) then do

  If H(i) = λx. H(j) then return $(H(j)M(2) ... M(m)) else
  set h := $(M(1)) and

  If h = (k, l) then return (k, l) else if h = k then
  cases: (i, k) belongs to V: return (i, k)

         i = n + 1 and k < n + 1: let H(n + 1) = λx. X(n + 1)
            set h := $([H(k)/x]X(n + 1))
            if h = p then $(H(p)M(2) ... M(m)) else
            return h

         (i, k) does not belong to V:

            if MATRIX(H_)(i, k) = p then $(H(p)M(2) ... M(m))
            else return (i, k).

Note that if the values $([H(k)/x]X(n + 1)) for k = 1, ..., n have been
precomputed and stored for look up then the procedure $( ) runs in time polynomial
in the input. $( ) computes a first approximation to the head of a @nf for the
input. It is used as follows. For i = 1, ..., n + 1 set h(i) = $([H(n + 1)/x] X(i)).
Define a graph G(V) as follows. The points of G(V) are the values h(i) and the
pairs (i, n + 1) in V. The edges are directed

(i, n + 1) → h(i).

Given (i, n + 1) in V, (i, n + 1) begins a unique path which either cycles or
terminates in a value outside of V. If this path cycles then H(i)H(n + 1) = @ as we
shall see below. The path terminates in a pair (j, k) only if MATRIX(H_)(j, k) =
@ so again H(i)H(n + 1) = @. Finally, if the path terminates in an integer k
then the last edge in the path is

(j, n + 1) → k

and we can conclude H(j)H(n + 1) = H(k). Thus, at least one new value can
be added to the matrix and V decreased by at least one.
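
The path analysis on G(V) can be sketched as follows. This is only an illustration under stated assumptions: the function name `follow` is hypothetical, the successor map `h` is keyed by the pair (i, n + 1) rather than by i, and the values of `h` in the usage example are made up. Since every point of V has exactly one outgoing edge, the path from a starting pair is unique and either revisits a point or leaves V.

```python
def follow(start, h, V):
    """Follow the unique path from `start`.

    Returns ("cycle", node) if a point of V is visited twice, and
    ("exit", value) for the first value reached that lies outside V.
    """
    seen = set()
    node = start
    while node in V:
        if node in seen:
            return ("cycle", node)
        seen.add(node)
        node = h[node]  # the single outgoing edge of this point
    return ("exit", node)


# Hypothetical data with n + 1 = 4: pairs are tuples, matrix values are integers.
V = {(1, 4), (2, 4), (3, 4)}
h = {(1, 4): (2, 4), (2, 4): 3, (3, 4): (3, 4)}
print(follow((1, 4), h, V))
print(follow((3, 4), h, V))
```

In the first call the path ends in the integer 3 through the edge (2, 4) → 3, which corresponds to the case where a new matrix value is obtained; the second call runs into a cycle.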

Lemma 3: If [H{j)/x] X{i) — H{i)H{j)M{l) . . . M{m) then = @ 

Proof: By the standardization theorem. 

Given admissible H_, MATRIX(H_) can be computed recursively from the
initial segments of H_ in time polynomial in H_.

Now suppose that we are given two HOO combinations together with the
reduction rules for their atoms. Construct an admissible H_ containing these
atoms.

This can be done in time polynomial in the input. Next compute
MATRIX(H_) as above. Using MATRIX(H_) compute (M)@ and (N)@ as
systems of assignment statements. Finally, add to these systems the equations
H(i)H(j) = ([H(j)/x] X(i)) for each pair H(i), H(j) in H_ (or rather the
corresponding systems of assignment statements), and, using the algorithm for the
word problem for finitely presented algebras [4], test whether M = N is a
consequence of these statements. In summary

Proposition 4: The word problem for HOO combinations can be solved in 
polynomial time. 

It can be shown by encoding the circuit value problem that the word problem 
for HOO combinations is log space complete for polynomial time. 



4 Combinators Hereditarily of Order Two 

Members of HOT are atoms with associated reduction rules. These reduction
rules generate a notion of reducibility ↠ and a congruence ↔. HOT and →
are defined simultaneously by induction as follows.

If X is an applicative combination of x's and y's then H defined by the
reduction rule Hxy → X belongs to HOT.

If X is a ↠ normal combination of x's, y's, and previously defined members
of HOT then H defined by the reduction rule Hxy → X belongs to HOT.

We shall write λxy. X for the member of HOT with reduction rule ending in
X. This notation induces a translation of HOT combinations into lambda terms.
It will be convenient to refer to this translation below. In particular, we will say
that the HOT combination M is strongly normal if it is in normal form and its
translation has a beta normal form.

It can be shown that HOT is not finitely generated and therefore HOT is not 
combinatorially complete. 

Simple data types can be encoded into HOT as follows.

Booleans:

T = K
F = K*

IMP = λxy. xyT
NEG = (λxy. yFT)T

Integers:

0 = K*K*

SUCC = λxy. yF)T
SUCC# = λab. SUCC(ab)

n + 1 = SUCC n (Barendregt numerals)

PRED = (λxy. yF)T
PRED# = λab. abT0(abF)

ZERO = (λxy. yT)T
sg = (λxy. yT01)T
sg~ = (λxy. yT10)T
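
The behaviour of these encodings can be checked with a small sketch that uses Python closures in place of combinators. It rests on assumptions that are not in the text: `succ` is taken to be the standard Barendregt successor n + 1 = λz. zFn, the helper names are illustrative, and each combinator of the form (λxy. X)T is written directly as the one-argument function it reduces to.

```python
T = lambda x: lambda y: x   # K
F = lambda x: lambda y: y   # K*

zero = F(F)                 # 0 = K*K*, which behaves as the identity


def succ(n):
    """Assumed Barendregt successor: n + 1 = λz. z F n."""
    return lambda z: z(F)(n)


pred = lambda n: n(F)           # (λxy. yF)T n reduces to n F
is_zero = lambda n: n(T)        # (λxy. yT)T n reduces to n T
negate = lambda b: b(F)(T)      # (λxy. yFT)T b reduces to b F T

one = succ(zero)
two = succ(one)
print(is_zero(zero) is T, is_zero(two) is F, pred(two) is one)
```

The zero test returns T exactly on 0, and the predecessor recovers the numeral from which a successor was built.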

Primitive recursive functions can be encoded into strongly normal HOT
combinations as follows.

+ = (λuv. v0UU(λxy. yT0(SUCC#(x(yF))))v)T
+# = (λuv. uv0UU(λxy. yT0(SUCC#(x(yF))))(uv)v)
− = (λuv. vIUU(λxy. yT0(PRED#(x(yF))))v)T

* = (λuv. v0UU(λxy. yT0(+#(x(yF))))v)T

so that we have

+ n m ↠ (SUCC^n) m = SUCC^(n+m) 0 = n + m

− n m ↠ (PRED^n) m = m − n if m + 1 > n, and 0 if n > m

* n m ↠ (+ m)^n 0 = n * m.

Using similar techniques and constructions one can find strongly normal HOT
combinations SQ, ROOT, D, QUAD RES, PAIR and R such that

D n m ↠ |n − m|

SQ T n ↠ n²

ROOT T n ↠ square root n rounded down
QUAD RES T n ↠ n − (square root n rounded down)²

PAIR n m ↠ ((n + m)² + m)² + n
R T n ↠ QUAD RES T (ROOT T n).

Note that R T (PAIR n m) ↔ m and QUAD RES T (PAIR n m) ↔ n.
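
The arithmetic behind this note can be checked directly on natural numbers. The sketch below does not model the combinators; `pair`, `quad_res` and `r` are plain Python functions, named here only for illustration, that compute the values the combinations PAIR, QUAD RES and R are stated to represent.

```python
from math import isqrt


def pair(n, m):
    """((n + m)^2 + m)^2 + n"""
    return ((n + m) ** 2 + m) ** 2 + n


def quad_res(x):
    """x minus the square of (square root of x rounded down)."""
    return x - isqrt(x) ** 2


def r(x):
    """quad_res applied to the rounded-down square root of x."""
    return quad_res(isqrt(x))


print(pair(3, 5), quad_res(pair(3, 5)), r(pair(3, 5)))
```

With A = (n + m)² + m, the value pair(n, m) = A² + n lies below (A + 1)² because n ≤ A, so the rounded-down root is A and the quadratic residue is n. The same argument applied to A itself yields m.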

Suppose now that M and N are strongly normal HOT combinations.

Define

SUM(M, N) = (λyx. x0 + (x0(Mx))(x0Nx))T

COMP(M, N) = (λyx. x0M(x0Nx))T

IT(M) = (λyx. x0UU(λuv. vT0(y0M(u(vF))))x)T.

Then

SUM(M, N) n ↠ +(Mn)(Nn)

COMP(M, N) n ↠ M(Nn)

IT(M) n ↠ M^n 0.

Thus by [6] page 93 we have verified that 

Proposition 5: Every primitive recursive unary function is representable by a 
HOT combination. 

Partial recursive functions can be represented by HOT combinations as
follows. Let t be the characteristic function of Kleene's T predicate, i.e., T(e, x, y) ⇔
t(e, x, y) = 0, and define primitive recursive functions r and s by

r(x) = x − (square root x rounded down)²

s(x) = square root x rounded down − (square root (square root x rounded down)
rounded down)².

The unary primitive recursive function t(e, r(x), s(x)) is represented by a
strongly normal HOT combination H(e). By [6] pages 83 and 84 we have

H(e) PAIR(n, m) ↠ t(e, n, m).

Now define

G(e) = λvu. uIUU(λxy. yIH(e)yT(yIRy)(x(yI(PAIR)(yILy)
((yI(SUCC)(yIRy))))(uI(PAIR)u0))T

and we obtain G(e)n = min{m : t(e, n, m) = 0}. Thus, if f is a strongly
normal HOT combination representing Kleene's result extracting function the term
(λyx. xIF(xIG(e)x))T is a strongly normal HOT combination which represents
the partial recursive function {e}. Thus, we have

Proposition 6: Every partial recursive function is representable by a strongly 
normal HOT combination.

Abstract. We propose an algebraic reconstruction of resolution as
Knuth-Bendix completion. The basic idea is to model propositional or- 
dered Horn resolution and resolution as rewrite-based solutions to the 
uniform word problems for semilattices and distributive lattices. Comple- 
tion for non-symmetric transitive relations and a variant of symmetriza- 
tion normalize and simplify the presentation. The procedural content 
of resolution, its refutational completeness and redundancy elimination 
techniques thereby reduce to standard algebraic completion techniques. 
The reconstruction is analogous to that of the Buchberger algorithm by 
equational completion. 



1 Introduction 

Is it a mere coincidence that the resolution procedure from computational logic, 
the Knuth-Bendix procedure from universal algebra and the Buchberger al- 
gorithm from computer algebra look so similar? Already in the mid-eighties, 
Buchberger suggested presenting the three procedures as instances of a
universal algorithm type of completion [4]. In this spirit, the Buchberger algorithm
has been reconstructed in terms of Knuth-Bendix completion [5]. But is the 
same program possible with resolution? Our main result is a positive answer 
to this question. But before further explanations, let us turn to the Buchberger 
algorithm. 

Logically, the Buchberger algorithm solves the uniform word problem for
certain polynomial rings: Given a finite set P of equations (a presentation) and an
equation s ≈ t (over some set of constants) in the language of polynomial rings,
determine whether every polynomial ring that satisfies P also satisfies s ≈ t.
For empty P, a canonical term rewrite system E for polynomial rings solves the
problem. Otherwise, P and E are completed; critical pairs between E and P are
eagerly determined. This symmetrization technique [9], which is not restricted
to polynomial rings, normalizes and simplifies P. Arbitrary equations between
polynomial ring expressions are rewritten as equations between polynomials.
Critical pairs of P (S-polynomials) are deduced modulo these normalizations and
certain redundant polynomials are deleted. P is thereby effectively transformed
into a Gröbner basis G, such that a polynomial equation p₀ ≈ 0 is a consequence
of P when p₀ can be rewritten to 0 by G.

At first sight, ordered resolution may appear similar to Knuth-Bendix com- 
pletion: certain consequences (determined by resolution) are added to an initial 



L. Bachmair (Ed.): RTA 2000, LNCS 1833, pp. 214-228, 2000. 
© Springer-Verlag Berlin Heidelberg 2000




An Algebra of Resolution 215 



clause set, certain redundant clauses are deleted and an ordering on clauses trig- 
gers these operations. But the intended reconstruction also leads to questions 
that are less obvious to answer: What is the word problem solved by resolution? 
Is the resolution rule a critical pair computation? Is there any symmetrization? Our
main result shows that (ground ordered) Horn resolution and resolution solve the 
uniform word problem for semilattices and distributive lattices via a variant of 
Knuth-Bendix completion for non-symmetric transitive relations [17]. The res- 
olution rule is a lattice-theoretic critical pair computation and the initial clause 
set a presentation P in terms of inequalities. It is even equivalent to the dis- 
tributivity axiom. Arbitrary lattice expressions are normalized via the respec- 
tive non-symmetric counterparts of canonical systems [11] given by Levy and 
Agusti. Redundant inequalities are deleted in accordance with non-symmetric 
completion. Thereby P is effectively transformed into a resolution basis R that 
provides a certain normal form proof for precisely those inequalities that are 
consequences of P. As a special case, P is inconsistent or trivial iff 1 < 0 is a 
consequence of P iff 1 < 0 is in R. 

The algebraic reconstruction of resolution provides a refined and uniform 
procedural view: First, the completion-based ordering constraints for resolution 
are weaker than the standard ones, where only maximal atoms are resolved. 
Only for boolean lattices (in analogy to the Buchberger algorithm) , the stronger 
constraints also yield solutions to the uniform word problem. Otherwise, they 
are too restrictive for word problems, but still sufficient for deciding inconsis- 
tency. Second, we obtain refutational completeness of resolution as a corollary 
of correctness of non-symmetric completion. Finally, redundancy concepts for 
ordered resolution are inherited from non-symmetric completion. 

The remainder is organized as follows. Section 2 introduces some rewriting 
and lattice theory basics. Section 3 sketches non-symmetric completion as far as 
needed for solving our word problems. Sections 4 and 5 specialize ground non- 
symmetric completion to solutions of uniform word problems for semilattices 
and distributive lattices. Section 6 discusses applications and extensions of these 
results: A strengthening of the ordering constraints to ordered resolution, the 
extension to boolean lattices and the lifting to the non-ground case. Section 7 
contains a conclusion. 



2 Preliminaries 

Let T_Σ(X) be a set of terms with signature Σ and variables in X. Terms are
identified with Σ ∪ X-labeled trees with nodes or positions in the monoid ℕ*. ε
denotes the root and n.i the i-th child of node n. A variable is linear (non-linear)
in a term, if it is the label of exactly one node (at least two nodes). For a term t
and a substitution σ, a skeleton position of tσ with respect to t is a node labeled
by a function or constant symbol in t. A variable instance position of tσ with
respect to t is a node p.q such that p is a node labeled by a variable in t. s[t]_p
denotes replacement of the subterm s|_p of a term s at position p by a term t.

A quasi-ordered set (quoset) (P, <) is a set P endowed with a reflexive
transitive relation <. A quoset is a partially ordered set (a poset), if < is also
antisymmetric. Posets identify elements that are congruent modulo ∼ = (< ∩ >) in
the associated quoset.
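
The position notation for terms introduced at the start of this section can be made concrete with a minimal sketch. The encoding is an assumption chosen for illustration: a term is either a string (a variable or constant) or a tuple whose first entry is the function symbol, and a position is a tuple of child indices with the empty tuple as the root. The names `subterm` and `replace` are hypothetical.

```python
def subterm(s, p):
    """Return the subterm of s at position p."""
    for i in p:
        s = s[i]  # s[0] is the function symbol, s[i] the i-th child
    return s


def replace(s, p, t):
    """Return s with the subterm at position p replaced by t."""
    if not p:
        return t
    i = p[0]
    return s[:i] + (replace(s[i], p[1:], t),) + s[i + 1:]


s = ("and", "x", ("or", "y", "z"))
print(subterm(s, (2, 1)))
print(replace(s, (2, 1), "a"))
```

Here (2, 1) is the first child of the second child of the root, so the example reads off y and replaces it by a.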

A join semilattice is a quoset (L, <) closed under least upper bounds (joins)
for all pairs of elements. For all x, y, z ∈ L, the join x ∨ y of x and y satisfies
x ∨ y < z iff x < z and y < z. A meet semilattice is a quoset (L, <) closed under
greatest lower bounds (meets) for all pairs of elements. For all x, y, z ∈ L, the
meet x ∧ y of x and y satisfies z < x ∧ y iff z < x and z < y. Joins and meets
are unique up to ∼. ⋁P and ⋀P denote joins and meets of finite P ⊆ L.

A lattice is both a join and a meet semilattice. The signature L = {∨, ∧} of
the set T_L(X) of lattice terms can be assumed variadic for flattening terms. A
lattice (L, <) is distributive, if for all x, y, z ∈ L one of x ∧ (y ∨ z) < (x ∧ y) ∨ (x ∧ z),
(x ∨ y) ∧ (x ∨ z) < x ∨ (y ∧ z) and therefore the other one holds. The inequalities
x ∧ (y ∨ z) > (x ∧ y) ∨ (x ∧ z) and (x ∨ y) ∧ (x ∨ z) > x ∨ (y ∧ z) hold in every
lattice. Lattices are usually defined as partial orderings, but for comparing with
resolution, quasi-orderings are more convenient.
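
As a concrete instance of these definitions, the following sketch checks the join and meet properties and the distributive law on one small lattice. The choice of example is an assumption made for illustration: the divisors of 60 ordered by divisibility, with least common multiple as join and greatest common divisor as meet. This lattice is a poset, so the two distributivity inequalities collapse into an equation.

```python
from itertools import product
from math import gcd

D = [d for d in range(1, 61) if 60 % d == 0]   # divisors of 60


def leq(x, y):
    """x is below y iff x divides y."""
    return y % x == 0


def join(x, y):
    """Least common multiple."""
    return x * y // gcd(x, y)


meet = gcd  # greatest common divisor


def check_lattice_laws():
    for x, y, z in product(D, repeat=3):
        # join is the least upper bound, meet the greatest lower bound
        assert leq(join(x, y), z) == (leq(x, z) and leq(y, z))
        assert leq(z, meet(x, y)) == (leq(z, x) and leq(z, y))
        # distributivity
        assert meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
    return True


print(check_lattice_laws())
```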

A mapping f between quosets A = (A, <_A) and B = (B, <_B) is monotonic,
if x <_A y implies f(x) <_B f(y), a join homomorphism, if f(x ∨ y) = f(x) ∨ f(y)
holds and a meet homomorphism, if f(x ∧ y) = f(x) ∧ f(y) holds for all x, y ∈ A.
Joins and meets are monotonic in both arguments.

A presentation P for a signature Σ of some algebra is a pair (G, R), where G is
a set of constants and R a set of defining identities s ≈ t over words s, t ∈ T_Σ(G).
The uniform word problem for a class K of Σ-algebras is the following decision
problem: Does Th(K) ∪ P ⊨ s ≈ t hold for the theory Th(K) of K, for an
arbitrary finite presentation P = (G, R) and identity s ≈ t over T_Σ(G)? It is
solvable, if there is an algorithm to decide the problem for arbitrary P and s ≈ t.
When K is axiomatized by a finite set E of identities, a solution exists iff the
congruence defining the quotient algebra of the term algebra T_Σ(G) by E and P
can be effectively constructed, for instance by a canonical term rewrite system
for E. We still say uniform word problem, when it is defined by inequalities.

3 Non-symmetric Rewriting and Completion 

We presuppose the basic concepts and notation of equational rewriting, as given,
for instance, in [7]. Non-symmetric rewriting is a technique for solving word or
reachability problems for binary relations that are partitioned into an increasing
part S and a decreasing part R with respect to some syntactic ordering ≺ on
the universe of the relations, by search along normal-form paths [11, 15] (cf. [16]
for a brief introduction in the context of lattices). It is based on the following
purely relational observations.

Lemma 1. Let R and S be binary relations on some set A.

(i) S*R* ⊆ R*S* iff (R ∪ S)* ⊆ R*S*.

(ii) SR ⊆ R*S* implies (R ∪ S)* ⊆ R*S*, if (R ∪ S⁻¹) is well-founded.

Setting S = R⁻¹, lemma 1 (i) and (ii) specialize to the Church-Rosser theorem
and Newman's lemma of equational rewriting. We visualize R ∪ S as a digraph
with edges labeled by R or S and nodes by elements of A. We consider proofs
or paths along the quasi-ordering (R ∪ S)* induced by R and S. A peak now is
a proof of the form SR; a valley a proof of the form R*S*.
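
Lemma 1 (i) can be tested on small finite relations. The sketch below is only a brute-force sanity check, not a proof; relations are encoded as Python sets of pairs, and the helper names `compose`, `star`, `lemma1_i` and `check_all` are assumptions introduced for this illustration.

```python
from itertools import product


def compose(P, Q):
    """Relational composition: a P b and b Q c gives (a, c)."""
    return {(a, c) for (a, b) in P for (b2, c) in Q if b == b2}


def star(P, A):
    """Reflexive transitive closure of P on the carrier A."""
    closure = {(a, a) for a in A} | set(P)
    while True:
        bigger = closure | compose(closure, closure)
        if bigger == closure:
            return closure
        closure = bigger


def lemma1_i(R, S, A):
    """Return the truth values of both sides of Lemma 1 (i)."""
    valley = compose(star(R, A), star(S, A))            # R*S*
    left = compose(star(S, A), star(R, A)) <= valley    # S*R* contained in R*S*
    right = star(R | S, A) <= valley                    # (R ∪ S)* contained in R*S*
    return left, right


def check_all(A):
    """Compare both sides for every pair of relations on A."""
    pairs = [(a, b) for a in A for b in A]
    rels = [{p for p, keep in zip(pairs, bits) if keep}
            for bits in product([0, 1], repeat=len(pairs))]
    return all(len(set(lemma1_i(R, S, A))) == 1 for R in rels for S in rels)


print(check_all([0, 1]))
print(lemma1_i({(0, 1)}, {(2, 0)}, [0, 1, 2]))
```

In the second call the peak 2 S 0 R 1 has no valley, so both sides of the equivalence fail together.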

Let R and S induce rewrite relations →_R and →_S. A term s rewrites to a
term t in one step at position p, written as s →_R t or s →_S t, if there is some
rewrite rule l →_R r ∈ R or l →_S r ∈ S such that s|_p = lσ for some position p
and substitution σ and t = s[rσ]_p. This presumes that all predecessor nodes of p
are labeled by names of monotonic functions. With our notation, →_R
decreases and →_S increases from left to right in presence of the syntactic ordering
≺. This is consistent with the standard arrow notation for clauses. Replacement
of peaks by valleys is now more complex than for equational rewriting. In general,
a critical pair is a pair in SR that can possibly not be replaced by a valley. As
in equational rewriting, there are critical pairs arising from skeleton positions.
For l₁ →_R r₁ and l₂ →_S r₂, they are either of the form (l₁σ[l₂σ]_p, r₁σ) for a
most general unifier σ of r₂ and l₁|_p or of the form (l₂σ, r₂σ[r₁σ]_p) for a most
general unifier σ of r₂|_p and l₁. But now also critical pairs from variable instance
positions of non-linear variables arise.

Example 1. Let R = {x ∧ x →_R x}, S = {a →_S b}. The peak b ∧ a →_S b ∧ b →_R b
cannot be replaced by a valley. Only a "backward" rewrite step, leading to
a ∧ a →_S b ∧ a →_S b ∧ b →_R b yields a replacement via a ∧ a →_R a.

Backward steps are unnecessary for linear variables. Then every peak can still 
be replaced by a valley. For variable critical pairs the position q of the variable 
instance position p.q can in general not be bounded. These pairs can be repre- 
sented, for instance, by context variables. The variable critical pair of example 1 
then becomes (C[a] A C[b],C[b]). 

Non-symmetric rewriting extends to non-symmetric rewriting modulo a con- 
gruence induced by a set E of equations, as in the equational case [8] . In particu- 
lar, for the associativity commutativity congruence (AC), the cliffs arising from 
ER- or AA-proofs can be eliminated by extended rules. The meet operation, for
example, is associative and commutative. For all r, s, t ∈ T_L(X), a rule r ∧ s →_R t
or r →_S t ∧ s is extended to (r ∧ s) ∧ x →_R t ∧ x or r ∧ x →_S (t ∧ s) ∧ x, where x
is a fresh extension variable. Extending all rules in R and S, it then remains to replace
syntactic unification by AC-unification. Consider [17] for concise definitions.

A non-symmetric completion procedure is a transformation on sets of
inequalities that preserves the set of consequences modulo the theory of (monotonic)
quasi-orderings. On success, its solution allows spanning every consequence by a
valley. By well-foundedness of R and S⁻¹, valleys can be effectively constructed
by searching for a common node of the finite acyclic R- and S⁻¹-digraphs. To
this end, exactly the critical pairs as part of the transitive-monotonicity closure
of the input set are computed. Additionally, certain redundant inequalities are
eliminated on the fly to keep the solution set small. In contrast to the decision
procedure, well-foundedness of R ∪ S⁻¹ is needed for completion.
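
The valley search just described can be sketched on the two rules of Example 1. The encoding is a deliberate simplification and an assumption of this illustration: a ground term is a tuple of one or two constants read as a meet, the R-rule is applied only at the root, and all function names are hypothetical. A valley from s to t exists iff some term is reachable from s by R-steps and from t by backward S-steps.

```python
def r_steps(t):
    """R-successors: the instance c ∧ c → c of x ∧ x → x."""
    return {(t[0],)} if len(t) == 2 and t[0] == t[1] else set()


def s_steps(t):
    """S-successors: replace one occurrence of a by b."""
    return {t[:i] + ("b",) + t[i + 1:] for i, c in enumerate(t) if c == "a"}


def s_preds(t):
    """S-predecessors: replace one occurrence of b by a."""
    return {t[:i] + ("a",) + t[i + 1:] for i, c in enumerate(t) if c == "b"}


def reachable(start, step):
    """All terms reachable from start by repeated application of step."""
    seen, todo = {start}, [start]
    while todo:
        for u in step(todo.pop()):
            if u not in seen:
                seen.add(u)
                todo.append(u)
    return seen


def has_valley(s, t):
    """Is there a proof s R* u S* t? Search forward along R, backward along S."""
    return bool(reachable(s, r_steps) & reachable(t, s_preds))


print(has_valley(("b", "a"), ("b",)))   # the peak of Example 1
print(has_valley(("a", "a"), ("b",)))   # after the backward step
```

The first query finds no common node although b ∧ a is below b via the peak through b ∧ b; the second succeeds through the common node a.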

We consider the subcase of non-symmetric AC-completion, where extension
variables are the only non-ground objects. In particular therefore, all terms are
linear. Since the deviations from the equational case are small, we only give a
sketch that can be completed, for instance, using [1]. Let T_Σ(G) be a set of ground
terms and V a set of extension variables. We use an AC-compatible reduction
ordering ≺ that is total on T_Σ(G), such that f(s₁, ..., s_m) ≺ f(t₁, ..., t_n), for
an AC-symbol f, if {|s₁, ..., s_m|} ≺ {|t₁, ..., t_n|}, where {|.|} denotes a multiset
and ≺ the multiset extension of ≺ for all AC-symbols f.¹

We define a non-symmetric completion procedure C_AC for linear terms as
a state transition system. States are triples (I, R, S) of a set I of (unordered)
inequalities l < r and two sets R and S of ordered inequalities l →_R r and
l →_S r with l ≻ r and l ≺ r. The initial state is (I₀, ∅, ∅) with I₀ ground, that is
without extended rules. The transitions are modeled by the following rules.

\[
\frac{(I, R, S)}{(I \cup \{s < t\}, R, S)} \quad \text{(Deduce)}
\]

if (s, t) is an AC-critical pair. This rule can also be written as an inference rule
on inequalities, using AC-unifiers from a minimal complete set.

\[
\frac{l_1 \to_R r_1 \qquad l_2 \to_S r_2}{l_1\sigma[l_2\sigma]_p < r_1\sigma}
\qquad
\frac{l_1 \to_R r_1 \qquad l_2 \to_S r_2}{l_2\sigma < r_2\sigma[r_1\sigma]_p}
\]

\[
\frac{(I \cup \{s < t\}, R, S)}{(I, R \cup \{s \to_R t\}, S)}
\qquad
\frac{(I \cup \{s < t\}, R, S)}{(I, R, S \cup \{s \to_S t\})}
\quad \text{(Orient)}
\]

if s ≻ t and s ≺ t, respectively.

\[
\frac{(I, R, S)}{(I, R \cup \{s \to_R t\}, S)}
\qquad
\frac{(I, R, S)}{(I, R, S \cup \{s \to_S t\})}
\quad \text{(Extend)}
\]

if s →_R t (s →_S t) is an extension of a non-extended rule in R (S).

\[
\frac{(I \cup \{s < t\}, R, S)}{(I, R, S)}
\qquad
\frac{(I, R \cup \{s \to_R t\}, S)}{(I, R, S)}
\qquad
\frac{(I, R, S \cup \{s \to_S t\})}{(I, R, S)}
\quad \text{(Delete)}
\]

if s < t, s →_R t or s →_S t can be replaced by a smaller proof defined as follows.
In particular, every proof of s < s can be replaced by the empty valley. We
compare single proof steps (s, t) using (l, r), which is either an inequality in I, an
(extended or non-extended) R or S rule or an AC-axiom, by the following tuples:
by ({s, t}, ⊥, ⊥) for l < r ∈ I, ({s, t}, ⊥, r) for l ≈ r an AC-axiom, ({s}, l, r) for
l →_R r ∈ R non-extended, ({s}, ⊥, r) for l →_R r ∈ R extended, ({t}, r, l) for
l →_S r ∈ S non-extended and ({t}, r, ⊥) for l →_S r ∈ S extended. Tuples are
ordered lexicographically using the multiset extension of ≺ for the first
component, the encompassment ordering for the second one and ≺ for the third one.
Proofs are compared as multisets of one-step proofs by the multiset-extension of
the one-step ordering. All these proof-orderings are denoted by ≺. They inherit
well-foundedness from their components.

¹ Such orderings exist [3, 6], even orienting the distributive laws in the desired direction.
For our reconstruction of resolution, these orderings can be substantially simplified.

A run of C_AC succeeds, if its limit state is of the form (∅, R, S). It is (weakly)
fair, if every continuously enabled transition is eventually executed. The pair
(R, S) is a normal system, if (R ∪ S)* ⊆ R*S*, R and S⁻¹ are well-founded and
no element of R or S can be deleted.

Under the transition rules of C_AC, proofs can be successively replaced by
simpler ones with respect to ≺.

Lemma 2. For the completion procedure C_AC and the ordering ≺ on one-step
and multi-step proofs.

(i) The transition rules of C_AC induce a well-founded proof transformation
relation with respect to ≺.

(ii) The procedure effectively transforms every I ∪ AC-proof into a valley.

Lemma 2 leads to the following correctness property.

Theorem 1. The limit state of a fair successful run of C_AC is a normal system.

This holds in particular, when the procedure terminates. In contrast to
equational completion, the Delete rules of C_AC are search-based and the Deduce
rule persists in the ground case. The former is the case, since a < b and a < c
imply neither b < c nor c < b, the latter, since the direction of arrows in the
Deduce rule does not match with any Delete rule.

4 The Uniform Word Problem for Semilattices 

We now consider the symmetrization, the interaction between a finite
presentation in terms of semilattice inequalities ⋀S < ⋀T, where S, T ⊆ G and G
is finite, and a normal system for semilattices. We normalize the presentation
and compute its critical pairs with the normal system for semilattices. Since
the language of semilattices is very simple (join is the only operation symbol),
the AC-ordering ≺ is very simple, too: we compare semilattice terms simply as
multisets of generators.

Lemma 3 ([11]). Given the AC-compatible ordering ≺, the following set N_S
of rules is a normal system for semilattices.

x ∧ y →_R x,   x ∧ y →_R y,   x →_S x ∧ x,   x ∧ y →_S x ∧ x ∧ y.

The last rule is extended, since meet is associative and commutative. The first
three rules together imply that x ∧ x < x and x < x ∧ x hold. We can thus
normalize with respect to idempotence and consider semilattice terms as sets
and not as multisets. Sets and multisets are also the natural data-structures for
clauses in resolution. Writing a clause as Γ → Δ, where Γ and Δ are multisets
of atoms, it is straightforward to show that → is a quasi-ordering. Semilattice
inequalities can still further be normalized, although not by means of rewriting.

² In the encompassment ordering, s lies below t if some subterm of t is an instance of s.



Lemma 4. Let s be a semilattice term and let T ⊆ G be a finite set of
generators. Then s < ⋀T iff s < t for all t ∈ T.

The proof is immediate from the definition of meet. We can thus restrict our
attention to uniform word problems defined by Horn inequalities of the form
⋀S < t, where S ⊆ G and t ∈ G. Translation to the resolution context then
yields Horn clauses. For our comparison with resolution, semilattices with a
minimal element 0 and a maximal element 1 are especially important. Hence
1 ∧ s = s and 0 ∧ s = 0 for all semilattice terms s. In particular 1 < 0 holds
only in the trivial semilattice consisting of one single element; hence it denotes
triviality or inconsistency.

Lemma 5. A finitely presented semilattice is finite and at most of size 2^|G|.

Of course at most 2^|G| different lattice terms can be built from G using the
data-structure of sets. Consequently, each AC-extension of a ground rewrite 
rule from the presentation and each rule from the normal system in lemma 3 
can be replaced — highly inefficiently — by a finite set of ground extensions. By 
lemma 4, extensions can be restricted to left-hand sides of inequalities. Also 
AC-matching and unification become very simple. Procedurally, these ground 
extensions should of course be lazily generated. It now remains to turn this 
discussion into an inference system. 

Given the AC-ordering ≺ that is total on ground terms, we define a
lattice-theoretic Horn ordered resolution calculus HOR as a specialization of C_AC by
the following inference rules on semilattice inequalities. We first introduce some
notation. Let s, s', t be semilattice terms; let a, b, c, ... and a₁, a₂, a₃, ... be
generators. We use both transition and inference rules to obtain a format that
is close to Horn resolution.

\[
\frac{(I \cup \{s < a_1 \wedge \dots \wedge a_n\}, R, S)}{(I \cup \{s < a_i : 1 \le i \le n\}, R, S)} \quad \text{(Splitting)}
\]

\[
\frac{s \wedge a \wedge a \to b}{s \wedge a < b} \quad \text{(Idempotence)}
\]

\[
\frac{s \to a \qquad s' \wedge a \to b}{s \wedge s' < b} \quad \text{(Resolution)}
\]

The Orient and Delete transition rules are inherited from C_AC. Special
derived transition rules are

\[
\frac{(I \cup \{s \wedge a < a\}, R, S)}{(I, R, S)} \quad \text{(LB)}
\]

and similarly with respect to R and S. We dispense with Extend rules because
we can finitely represent them.

Proposition 1. HOR is a normalizing specialization of C_AC for semilattices.
Every fair run succeeds; its limit state is a normal system for the presentation
modulo the theory of semilattices.

Proof. Let P = (G, R) be a finite presentation of a semilattice. By lemma 4, the
Splitting rule is an equivalence transformation to Horn inequalities, whereas
all other rules in HOR presuppose that format. Without loss of generality we
assume that all inequalities in P are Horn.

We now compute the critical pairs between oriented variants of a Horn
inequality s < a of P with the rules of N_S from lemma 3. First, there are no
critical pairs with the lower bound rules x ∧ y →_R x and x ∧ y →_R y, since a is
a generator and all variables are linear. Second, we consider the critical pairs with
x →_S x ∧ x. For a rule s ∧ a ∧ a →_R b, there is a critical pair s ∧ a < b. This leads
to an instance of the Idempotence rule. There is also a variable critical pair
C[s ∧ a ∧ a] < C[s ∧ a ∧ a] ∧ C[b], which can be finitely represented, according to
lemma 5, by appropriate lattice terms. Using Splitting, all these critical pairs
can be deleted in favor of s ∧ a ∧ a →_R b. Third, we consider the critical pairs
with x ∧ y →_S x ∧ x ∧ y. Both the skeleton critical pair and the variable critical
pair contribute nothing new.

We now consider the critical pairs between two members s < a and s' ∧ a < b
of P. This leads immediately to the Resolution rule, depicted in figure 1.

Figure 1. Hasse diagram for Horn resolution



Algebraically, it can be derived using monotonicity of meet and transitivity. By 
our discussion following lemma 5, we can dispense with extended rules in favor 
of finite expansions of the presentation. But these only yield expressions that 
can be deleted. 

We have now considered all critical pairs among elements of the presentation
and between members of the presentation and the normal system for semilattices.
We observe two facts. First, the Horn property is preserved by these Deduce
steps. Second, all generators in the conclusions of these steps already appear
in the premises. We can therefore apply the above analysis to an entire run of
C_AC, not only to the presentation. Therefore, the C_AC rules specialize to HOR
rules. Since Orient is trivial, it remains to consider LB. Obviously, every Horn
inequality s ∧ a < a or rule s ∧ a →_R a can be deleted in presence of x ∧ y →_R y
from N_S. This rule can also be applied recursively at each stage of the process.

It is now straightforward to inspect that for all rules of HOR, the conclusion is 
smaller than the maximal premise. Therefore the associated proof transformation 
relation terminates. As a consequence of lemma 2 and 1 and of the fact that all 
ground rules can be oriented, we obtain the desired result, that is a normal 
system for the presentation. □ 

Simple examples show that neither ground non-symmetric completion nor un- 
ordered ground Horn resolution always terminate, even in presence of fairness 
assumptions [17]. By lemma 5 and the fact that no rule of HOR introduces new 
variables or constants, however, HOR can from some stage on only produce in- 
equalities with multiple occurrences of some generators at the left-hand side. 
These can of course be deleted by Idempotence. In order to express termina- 
tion modulo idempotence in a generic way, we call an inference redundant, if 
either some premise or the conclusion can be deleted. 

Lemma 6. HOR terminates (up to redundant inferences) for every presentation. 

Theorem 2. HOR solves the uniform word problem for semilattices. 

Proof. For solving the uniform word problem, split a given equational
presentation into inequalities and run HOR to obtain a normal system. Then split
the word identity ⋀S ≈ ⋀T into word inequalities ⋀S < t for all t ∈ T and
⋀T < s for all s ∈ S and try to connect them by valleys using the normal
system, according to the standard decision procedure of non-symmetric rewriting,
as described in section 3. □
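
The shape of this decision procedure can be sketched in a few lines. The sketch is a simplification and not the calculus HOR: it saturates under unordered Resolution, so the ordering constraints and the R/S orientation are ignored, and it answers queries by a subsumption test instead of a valley search. A Horn inequality ⋀S < a is encoded as a pair (set of generators, generator), and all function names are assumptions of this illustration.

```python
def saturate(horn):
    """Close a set of Horn inequalities (body, head) under Resolution."""
    # inequalities whose head occurs in the body are dropped, as in LB
    clauses = {(frozenset(b), h) for b, h in horn if h not in b}
    while True:
        new = set()
        for s, a in clauses:
            for s2, b in clauses:
                if a in s2:
                    body = s | (s2 - {a})   # from s < a and s' ∧ a < b infer s ∧ s' < b
                    if b not in body:
                        new.add((body, b))
        if new <= clauses:
            return clauses
        clauses |= new


def holds(clauses, body, head):
    """Does ⋀body < head follow? Bodies are sets, so idempotence is built in."""
    body = frozenset(body)
    return head in body or any(h == head and b <= body for b, h in clauses)


def identity_holds(horn, S, T):
    """Split ⋀S ≈ ⋀T into Horn inequalities and test each of them."""
    clauses = saturate(horn)
    return (all(holds(clauses, S, t) for t in T)
            and all(holds(clauses, T, s) for s in S))


presentation = [({"a"}, "b"), ({"b", "c"}, "d")]
print(identity_holds(presentation, {"a", "b"}, {"a"}))
print(holds(saturate(presentation), {"a", "c"}, "d"))
```

With a < b in the presentation, a ∧ b ≈ a holds, and the inequality a ∧ c < d is obtained by one Resolution step.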

The following inconsistency (or triviality) test is a corollary to theorem 2. 

Corollary 1. Let P be a presentation for a semilattice in which 1 < 0 holds.
Let 0 and 1 be least elements in the precedence for ≺. Then HOR derives 1 < 0
from P.

Proof. If 1 < 0 is implied by P in the theory of semilattices, then there is a valley
proving this inequality by theorem 2. If 1 and 0 are minimal in the precedence,
then the only valley can be 1 < 0 itself. Hence the inequality must be in the
normal system of P. □

Corollary 1 can be extended to an algebraic refutational completeness proof as 
follows: There is no meet homomorphism from a semilattice in which 1 < 0 holds 
to the two element lattice 2, since meet homomorphisms are required to map 
zero to zero and one to one and moreover they are monotonic. Thus there is no 
satisfying valuation for this presentation, that is the presentation is inconsistent. 



Lemma 7. Replace the Idempotence inference rule by the transition rule 

{L \J {s A a A a < b} , R, S) 

(/ U {s A a < b}, R, S) 
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in HOR. The new procedure still has the properties of proposition 1, lemma 6, 
theorem 2 and corollary 1. Thus Idempotence can he applied as a simplification. 

Proof. Consider the inequality $s \wedge a \wedge a \le b$ in a presentation P. By
the instance $a \wedge a \to_R a$ of $x \wedge y \to_R x$ from $N_S$ it rewrites
to $s \wedge a \le b$. In general, this is not sound for inequalities, but since
$x \to_S x \wedge x$ is also in $N_S$ and therefore $x \wedge x \le x$ and
$x \le x \wedge x$ hold, the step is an equivalence transformation. But then
$s \wedge a \wedge a \le b$ can be deleted. Again the argument can be extended to
runs. This yields the above transition rule. □

5 The Uniform Word Problem for Distributive Lattices 

We now consider distributive lattices and apply the same technique as for semi- 
lattices. Now our signature consists of joins and meets. 

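Lemma 8 ([11]). Given two AC-compatible orderings for R- and S-rules¹, the
following set of rules is a normal system $N_D$ for distributive lattices.

$$
\begin{array}{l}
x \vee x \to_R x, \qquad x \vee x \vee y \to_R x \vee y, \qquad x \wedge y \to_R x, \\
x \wedge (y \vee z) \to_R (x \wedge y) \vee (x \wedge z), \qquad (x \wedge (y \vee z)) \wedge w \to_R ((x \wedge y) \vee (x \wedge z)) \wedge w, \\
x \to_S x \wedge x, \qquad x \wedge y \to_S x \wedge x \wedge y, \qquad x \to_S x \vee y, \\
(x \vee y) \wedge (x \vee z) \to_S x \vee (y \wedge z), \qquad ((x \vee y) \wedge (x \vee z)) \vee w \to_S (x \vee (y \wedge z)) \vee w.
\end{array}
$$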

It allows the following normalization of lattice terms. 

Lemma 9. Let L be a distributive lattice, G a finite set of generators.

(i) Every lattice term t is equivalent in L to a join of meets of generators and
to a meet of joins of generators from t.

(ii) $\bigvee S \le \bigwedge T$ holds for $S, T \subseteq T_\Sigma(G)$ iff $s \le t$
holds for all $s \in S$ and $t \in T$.

(iii) An inequality $s \le t$ holds in L iff there is a set of inequalities
$\bigwedge S_i \le \bigvee T_i$ that hold in L and the $S_i$ ($T_i$) are sets of
generators from s (t).

Proof. (ad i) Since $x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z)$ and
$x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z)$ hold in a distributive
lattice, distributing out terms according to the rules of the above rewrite
system is an equivalence transformation.

(ad ii) By lemma 4 for meets and by duality for joins. 

(ad iii) We put (i) and (ii) together. The transformation is effective. □ 

We can thus restrict our attention to uniform word problems defined by
inequalities of the form $\bigwedge S \le \bigvee T$, where $S, T \subseteq G$. We
write njm(t) (nmj(t)) for the normal form of t as a join of meets (meet of joins)
of generators. They can be determined with only linear blowup in the size of
terms [18]. Again we obtain the data-structure of sets and can dispense with
extended rules.
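
The two normal forms and the reduction of Lemma 9 can be sketched in Python as
follows. This is a naive illustration under stated assumptions: terms are
nested tuples `("gen", name)`, `("meet", l, r)` and `("join", l, r)`, the
function `distribute_and_split` is a hypothetical name, and distributing out
directly may blow up exponentially, so it is not the linear construction of [18].

```python
def njm(term):
    """Join-of-meets form: a set of frozensets, each one a meet of generators."""
    if term[0] == "gen":
        return {frozenset([term[1]])}
    left, right = njm(term[1]), njm(term[2])
    if term[0] == "join":
        return left | right
    # meet: distribute over the joins of both arguments
    return {m1 | m2 for m1 in left for m2 in right}


def nmj(term):
    """Meet-of-joins form, obtained dually: each frozenset is a join of generators."""
    if term[0] == "gen":
        return {frozenset([term[1]])}
    left, right = nmj(term[1]), nmj(term[2])
    if term[0] == "meet":
        return left | right
    # join: distribute over the meets of both arguments
    return {j1 | j2 for j1 in left for j2 in right}


def distribute_and_split(u, v):
    """Turn u <= v into pairs (S, T), each read as meet(S) <= join(T)."""
    return {(s, t) for s in njm(u) for t in nmj(v)}


a, b, c = ("gen", "a"), ("gen", "b"), ("gen", "c")
u = ("meet", a, ("join", b, c))      # a /\ (b \/ c)
v = ("join", c, ("meet", a, b))      # c \/ (a /\ b)
```

Here `njm(u)` consists of the meets {a, b} and {a, c}, `nmj(v)` of the joins
{a, c} and {b, c}, and `distribute_and_split(u, v)` yields the four
inequalities between them. Using frozensets makes idempotence, associativity
and commutativity implicit.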

¹ Recall that well-foundedness of the separate orderings is sufficient for the
decision procedure, but not for completion.
Lemma 10. A finitely presented distributive lattice is finite and at most of size
$2^{2^{|G|}}$.

Working with two separate well-founded orderings is not possible for
non-symmetric completion. However, one single ordering, giving either join
precedence over meet or conversely, would be sufficient if either the S-rules or
the R-rules of $N_D$ dealing with distributivity were discarded. The orientation
of the other rules in $N_D$ is invariant under this choice of the precedence. At
the level of the following calculus, where all inequalities have the form
$\bigwedge S \le \bigvee T$, where $S, T \subseteq G$, the choice is even
irrelevant, since joins and meets are always separated and never need to be
compared. One can therefore treat S and T equally as multisets of generators.

We define a lattice-theoretic ordered resolution calculus OR for distributive
lattices as a specialization of Cac by the following inference rules on lattice
inequalities. We first introduce some notation. Let u, v be lattice terms, let
$s, s'$ and $s_1, s_2, s_3, \ldots$ be terms in the language of the meet
semilattice and $t, t'$ and $t_1, t_2, t_3, \ldots$ be terms in the language of
the join semilattice. We denote generators by $a, b, c, \ldots$ or
$a_1, a_2, a_3, \ldots$ or $b_1, b_2, b_3, \ldots$. We again distinguish between
transition and inference rules.

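$$\frac{(I \cup \{u \le v\}, R, S)}{(I \cup \{\mathit{njm}(u) \le \mathit{nmj}(v)\}, R, S)} \qquad \text{(Distribute)}$$

$$\frac{(I \cup \{a_1 \vee \dots \vee a_m \le b_1 \wedge \dots \wedge b_n\}, R, S)}{(I \cup \{a_i \le b_j : 1 \le i \le m,\ 1 \le j \le n\}, R, S)} \qquad \text{(Splitting)}$$

$$\frac{s \wedge a \wedge a \to_R t}{s \wedge a \le t}, \qquad \frac{s \to_S t \vee a \vee a}{s \le t \vee a} \qquad \text{(Idempotence)}$$

$$\frac{s \to_S t \vee a \qquad s' \wedge a \to_R t'}{s \wedge s' \le t \vee t'} \qquad \text{(Resolution)}$$

The Orient and Delete transition rules are inherited from Cac. Special derived
transition rules are

$$\frac{(I \cup \{s \wedge a \le a \vee t\}, R, S)}{(I, R, S)} \qquad \text{(LB/UB)}$$

and similarly with respect to R and S.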

Proposition 2. OR is a normalizing specialization of Cac for distributive lat- 
tices. Every fair run succeeds; its limit state is a normal system for the presen- 
tation modulo the theory of distributive lattices. 



Proof. The proof is similar to that of proposition 1. We assume a precedence
where joins are bigger than meets. Again, the Splitting rule, which is defined
in accordance with lemma 9, is used for preprocessing lattice terms. Here it is
applied together with the Distribute rule. Since this rule corresponds to an
equivalence transformation, it is a simplification rule that allows us to discard
the non-normalized inequality, as for idempotence in proposition 1. Most of the
critical pairs have already been considered in the proof of proposition 1, up to
lattice duality. It remains to consider the critical pairs among members of the
presentation and between them and the distributivity laws. By lemma 10 we can
again dispense with extended rules.



Figure 2. Hasse diagrams for Resolution



There are no immediate critical pairs among members of the presentation,
except when one is a Horn inequality or a dual Horn inequality $a \le \bigvee T$
for $a \in G$ and $T \subseteq G$. However, two members of the presentation can
have critical pairs modulo distributivity in a particular case.

In order to work consistently with our choice of precedence, we must exclude
the S-rules in $N_D$ dealing with distributivity. We will argue that this
restriction is justified. The only critical pair between an inequality
$s \to_S t \vee u$ and the distributive inequality
$x \wedge (y \vee z) \to_R (x \wedge y) \vee (x \wedge z)$ is
$x \wedge s \le (x \wedge t) \vee (x \wedge u)$. By lemma 10, we can dispense
with x again by instantiation with generators. In the special case that one of
t and u, u say, is a generator a, an additional critical pair with an inequality
$s' \wedge a \to_R t'$ may exist. Otherwise the above critical pair can be
normalized and then deleted. The additional critical pair is
$s \wedge s' \wedge x' \le (s' \wedge x' \wedge t) \vee (x' \wedge t')$, where x
has been instantiated by $s' \wedge x'$. Normalizing with respect to Distribute,
using Splitting and LB/UB, the pair simplifies to
$s \wedge s' \wedge x' \le t \vee t'$. Since every instance of this inequality
can be deleted in favor of $s \wedge s' \le t \vee t'$, only this last rule must
be kept. This derivation of an instance of Resolution is depicted in the
left-hand diagram of figure 2. Algebraically, the dotted lines denote application
of monotonicity of join and meet; the diagonal of the diagram shows the
conclusion. Note that only the R-rule dealing with distributivity has been used.

Similarly to the preceding paragraph, using a precedence where meets are
bigger than joins and excluding the R-rules in $N_D$ dealing with distributivity
yields a critical pair $(x \vee s_1) \wedge (x \vee s_2) \to_R x \vee t$ from
$s_1 \wedge s_2 \to_R t$ and $(x \vee y) \wedge (x \vee z) \to_S x \vee (y \wedge z)$.
This situation is completely dual to the one above (by exchange of join and meet
and inverting the inequalities). It leads to the situation depicted in the
right-hand diagram of figure 2. Both derivations deduce the same conclusion from
the same premises; nothing new is introduced. This justifies discarding one
variant of the distributivity rules. The Resolution rule is the most general way
in which two members of the presentation can overlap modulo distributivity. It
covers the case where one of the premises is a Horn inequality or its dual.

The format of the presentation is preserved for all operations and no new
generators are introduced. Therefore the argument can be extended from
generators to the whole run of Cac. Together with the results of proposition 1,
this yields the inference rules of OR. For termination of the proof
transformation it remains to show that every conclusion of a Resolution rule is
smaller than the maximal premise. This is not immediately the case, as the
ordering constraints $t \succ s' \succ t' \succ s \succ a$ for the above rule
show. Appropriate instances of the distributivity law and monotonicity must
therefore be implicitly added for enforcing it. Lemmas 2 and 1 and the fact that
all ground rules can be oriented then yield the desired result, that is, a normal
system for the presentation. □

Our derivation of the resolution rule reveals the surprising fact that it is
essentially an application of the distributivity law. In fact, a lattice
$(L, \le)$ is distributive iff $s \le t \vee a$ and $s' \wedge a \le t'$ imply
$s \wedge s' \le t \vee t'$ for all $s, s', t, t', a \in L$. A similar
characterization can be found in [12].
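
This characterization can be checked by brute force on small finite lattices.
The following Python sketch is an illustration only; the helper names and the
three example lattices (the divisors of 12, the diamond M3 and the pentagon N5)
are choices made here and do not come from the text.

```python
from itertools import product


def make_lattice(elements, leq):
    """Return (meet, join) for a finite lattice given by its order relation."""
    def meet(x, y):
        lower = [z for z in elements if leq(z, x) and leq(z, y)]
        return next(z for z in lower if all(leq(w, z) for w in lower))

    def join(x, y):
        upper = [z for z in elements if leq(x, z) and leq(y, z)]
        return next(z for z in upper if all(leq(z, w) for w in upper))

    return meet, join


def is_distributive(elements, leq):
    meet, join = make_lattice(elements, leq)
    return all(meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
               for x, y, z in product(elements, repeat=3))


def resolution_holds(elements, leq):
    """Check: s <= t \/ a and s' /\ a <= t' imply s /\ s' <= t \/ t'."""
    meet, join = make_lattice(elements, leq)
    for s, s2, t, t2, a in product(elements, repeat=5):
        if leq(s, join(t, a)) and leq(meet(s2, a), t2):
            if not leq(meet(s, s2), join(t, t2)):
                return False
    return True


# Divisors of 12 under divisibility (distributive).
DIV12 = ([1, 2, 3, 4, 6, 12], lambda x, y: y % x == 0)
# Diamond M3: three pairwise incomparable atoms between 0 and 1.
M3 = (["0", "x", "y", "z", "1"],
      lambda p, q: p == q or p == "0" or q == "1")
# Pentagon N5: 0 < a < b < 1 and c incomparable to a and b.
N5 = (["0", "a", "b", "c", "1"],
      lambda p, q: p == q or p == "0" or q == "1" or (p, q) == ("a", "b"))
```

On these examples `is_distributive` and `resolution_holds` agree: both hold for
`DIV12` and both fail for `M3` and `N5`.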

Lemma 11. OR terminates (up to redundant inferences) for every presentation. 

Theorem 3. OR solves the uniform word problem for distributive lattices. 

Corollary 2. Let P be a presentation for a distributive lattice with 0 and 1 such
that $1 \le 0$ holds in that lattice. Let moreover 0 and 1 be the least elements
of the precedence for A. Then $1 \le 0$ is in the normal system derived from P.

The proofs of lemma 11, theorem 3 and corollary 2 are similar to those of 
lemma 6, theorem 2 and corollary 1. Again, corollary 2 can be extended to 
an algebraic completeness proof. 
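
The decision procedure behind theorem 3 can be approximated by a naive
saturation. The following Python sketch is not the ordered calculus OR: it
applies Resolution without any ordering restrictions, uses frozensets so that
Idempotence is implicit, and assumes that every inequality has non-empty sides.
All function names are hypothetical. A brute-force check over valuations into
the two-element lattice is included for comparison.

```python
from itertools import product


def saturate(presentation):
    """Close pairs (S, T), read as meet(S) <= join(T), under Resolution."""
    clauses = {(frozenset(s), frozenset(t)) for s, t in presentation}
    clauses = {c for c in clauses if not c[0] & c[1]}   # LB/UB: drop trivial ones
    changed = True
    while changed:
        changed = False
        for (s, t), (s2, t2) in product(list(clauses), repeat=2):
            for a in t & s2:        # a is joined on the right and met on the left
                new = (s | (s2 - {a}), (t - {a}) | t2)
                if not new[0] & new[1] and new not in clauses:
                    clauses.add(new)
                    changed = True
    return clauses


def entailed(presentation, S, T):
    """Decide meet(S) <= join(T) from the saturated presentation."""
    S, T = frozenset(S), frozenset(T)
    if S & T:                       # trivial by LB/UB
        return True
    return any(s <= S and t <= T for s, t in saturate(presentation))


def entailed_by_valuations(presentation, S, T, generators):
    """Reference check: test all valuations of the generators in {0, 1}."""
    gens = sorted(generators)
    for bits in product([0, 1], repeat=len(gens)):
        v = dict(zip(gens, bits))

        def holds(s, t):
            return min(v[g] for g in s) <= max(v[g] for g in t)

        if all(holds(s, t) for s, t in presentation) and not holds(S, T):
            return False
    return True


# Example: a <= b \/ c  and  a /\ b <= c.
PRES = [({"a"}, {"b", "c"}), ({"a", "b"}, {"c"})]
```

For `PRES`, resolving on b yields a <= c, so `entailed(PRES, {"a"}, {"c"})`
holds, whereas `entailed(PRES, {"b"}, {"c"})` does not.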

Lemma 12. Replace the Idempotence inference rules by the transition rules

$$\frac{(I \cup \{s \wedge a \wedge a \le t\}, R, S)}{(I \cup \{s \wedge a \le t\}, R, S)} \qquad \frac{(I \cup \{s \le t \vee a \vee a\}, R, S)}{(I \cup \{s \le t \vee a\}, R, S)}$$

in OR. The new procedure still has the properties of proposition 2, lemma 11,
theorem 3 and corollary 2. Thus Idempotence can be applied as a simplification.

The proof is similar to that of lemma 7. 

6 Discussion 

Our derivation of resolution calculi from non-symmetric rewriting and comple- 
tion is quite fortunate for several reasons. First, semilattices and lattices are two 
of the few structures for which non-symmetric normal systems have so far been 
given. This has to do with the inherent weakness of non-symmetric rewriting
due to variable critical pairs. Even the derivation of these simple systems (which
one might even be able to guess) has been based on ad hoc reasoning [11];
however, a representation of general non-symmetric completion as a second-order
procedure might be promising. Second, finiteness of finitely presented
semilattices and distributive lattices leads to a simple circumvention of context
variables that otherwise might have caused difficulties.

The precise translation from lattice inequalities to clauses is straightforward;
we leave it implicit. The calculi described so far are weaker than the standard
ordered resolution calculi (cf. [10]), since we have not required that maximal
terms are cut out. Only for boolean lattices does this restriction still yield a
solution to the uniform word problem, whereas for semilattices and distributive
lattices too few critical pairs are computed. A simple proof permutation
argument (cf. [17]) shows that inconsistency tests are still possible with the
stronger constraints. Hence only for boolean lattices is the situation similar to
the Buchberger algorithm. Algebraically, the ideals of boolean lattices and rings
have similar properties, whereas those of distributive lattices and semilattices
behave differently. Our methods are in particular important for the boolean case,
where, in contrast to the distributive case, no equational canonical rewrite
system exists [14].

Our calculi can be lifted to the non-ground case. There are no variable criti- 
cal pairs; predicate symbols are interpreted as non-monotonic operations which 
block the application of rewrite steps below the lattice level. An ordered or unfail- 
ing version of non-symmetric completion should be used to obtain the ordering 
restrictions of resolution. Equational ordered or unfailing completion [13, 2, 1] 
is a semi-decision procedure for word problems. Critical pair computations are 
performed not on ordered equations, but on orderable instances. For many
applications, unique normal forms only for ground instances of terms (which is
weaker than for terms) suffice. For instance, all ground instances of the commutativity
law are orderable, whereas the law itself is not. Consequently, ordered completion 
is based on syntactic unification and AC-compatible orderings are superfluous. 
However, a detailed comparison between the methods of non-ground ordered 
resolution and equational ordered or unfailing completion is left for future work. 

7 Conclusion 

We have reconstructed ground ordered resolution algebraically as a solution to 
lattice-theoretic uniform word problems based on a generalization of Knuth- 
Bendix completion to non-symmetric transitive relations. In fact, a variant of 
the resolution rule yields exactly the specific difference between lattices and dis- 
tributive lattices. Our construction is analogous to the simulation of the Buch- 
berger algorithm by equational completion. We essentially used normalization 
techniques to integrate the effect of distributive lattice theory in the presentation. 
The refutational completeness proof of resolution and redundancy elimination 
techniques can thus be reduced to those of non-symmetric rewriting. Beyond the 
results presented in this text, lifting of resolution shows interesting
correspondences to unfailing completion and a more abstract ideal-theoretic
comparison between our lattice word problem and the Buchberger algorithm could
lead to structural insights.
Abstract. We show how to derive refutationally complete ground superposition
calculi systematically from convergent term rewriting systems
for equational theories, in order to make automated theorem proving in 
these theories more effective. In particular we consider abelian groups and 
commutative rings. These are difficult for automated theorem provers, 
since their axioms of associativity, commutativity, distributivity and the 
inverse law can generate many variations of the same equation. For these 
theories ordering restrictions can be strengthened so that inferences ap- 
ply only to maximal summands, and superpositions into the inverse law 
that move summands from one side of an equation to the other can be 
replaced by an isolation rule that isolates the maximal terms on one side. 
Additional inferences arise from superpositions of extended clauses, but 
we can show that most of these are redundant. In particular, none are 
needed in the case of abelian groups, and at most one for any pair of 
ground clauses in the case of commutative rings. 



1 Introduction 

Automated theorem provers face problems when they are used on theories whose 
axioms generate large search spaces. Overwhelmed by a huge number of trivial 
consequences of each fact, they fail to prove even rather simple theorems. 

Our goal in this work is to improve the methods for superposition theorem 
proving in the context of algebraic theories. To avoid a separate completeness 
proof for each theory and to gain a better understanding of the general mech- 
anism we develop a framework that allows us to derive superposition calculi 
systematically from convergent term rewriting systems for algebraic theories. 
This framework consists of a parameterized superposition calculus, where the 
parameters are a term ordering, a simplification function and a symmetrization 
function. We assume certain properties of the parameters which allow us to prove 
refutational completeness of the parameterized calculus. For many important al- 
gebraic theories such as abelian groups, commutative rings, modules or algebras 
suitable parameters exist (Stuber 1999a). We use the theory of commutative 

* Complete proofs can be found in 

http://www.mpi-sb.mpg.de/~juergen/publications/Stuber1999Diss.html.
rings as our main example, since it is important in many applications and allows
us to demonstrate our approach to critical pairs of extensions. For less well- 
behaved algebraic theories, e.g. modules over rings with zero divisors, certain 
critical pairs in the symmetrization cannot be made to converge uniformly and 
must be considered explicitly; such theories would require an extension of our 
approach. 

We arrive at calculi which are improved in several respects. First, stronger 
ordering restrictions require that inferences apply only to certain maximal sub- 
terms; in the case of commutative rings these are maximal summands within 
the top-level sum. Second, we replace some direct uses of axioms by macro infer- 
ences. Superposition inferences into theory equations, which would lead to many 
variants of a clause, are replaced by special simplification rules, such as isolation 
for the inverse law, and by introducing semantic matching into the superposi- 
tion rule. We formalize this by associating to each original equation an extended 
set of term rewriting rules, called its symmetrization. By implicitly using these 
extensions for semantic matching, we avoid explicitly adding the corresponding
extended clauses. Instead of superposition inferences between such explicitly
represented extensions the calculi contain an extension superposition rule that 
is needed to accommodate critical peaks between the implicit extensions. Since 
the form of the symmetrization is known for any particular theory, we can derive 
redundancy criteria which dispose of all or most of these inferences. In the case 
of abelian groups no such inference is needed, while in the case of commutative 
rings at most one extension superposition inference is needed for any pair of 
ground clauses. The combination of stronger ordering restrictions, macro infer- 
ences and redundancy criteria promises to be much more efficient than a more 
general calculus applied to part of the axioms. For instance, in purely equational 
reasoning it has been demonstrated that special calculi can improve performance 
greatly (Zhang 1993, Marché 1996).

Our goal is to obtain refutationally complete calculi for arbitrary first-order 
formulas, without restrictions on the logical structure or the set of function sym- 
bols. Here we consider only the ground case, as we believe that lifting the general 
calculus would offer no substantial new insights. More is possible by consider- 
ing specific theories separately. For instance, on the ground level Godoy and 
Nieuwenhuis (2000) use the same approach for the theory of abelian groups, but 
they represent nonground equations uniformly as t « 0. This way of abstracting 
away information about the maximal summand during lifting has the possible ad- 
vantage of avoiding inferences with group axioms, which are replaced by abelian 
group unification. Lifting is a problem for unshielded variables, i.e. variables 
which do not appear below a free function symbol. In general these variables 
must be split into a sum of a maximal and a nonmaximal part, which gives rise 
to very prolific inferences. For certain cases these problematic variables can be 
eliminated (Waldmann 1998, Stuber 1998a). 

If, on the other hand, we restrict the ground case for commutative rings 
further to the case of ground equations over a finite set of constants, then our 
calculus generates essentially the same inferences as the Gröbner base algorithm
for polynomials over the integers (Kandri-Rody and Kapur 1988).

Our work builds on several strands of research: automated first-order theorem
proving, term rewriting, and the theory of Gröbner bases.

We prove refutational completeness by showing that our calculi reduce min- 
imal counterexamples, which is the standard method for completeness proofs of 
superposition calculi (Bachmair and Ganzinger 1998). Wertz (1992) is the first 
to build superposition calculi for theorem proving modulo E, and in particular 
modulo AG. He uses interpretations where transitivity holds universally but E 
holds only below a certain bound. In contrast to this, the approach of Bachmair 
and Ganzinger (1994a) to AG-superposition sacrifices universal validity of tran- 
sitivity to get universal validity of AG. This allows them to handle AG-matching 
and AG-unification as black boxes, and in many important cases, for instance 
when simplifying by rewriting, the bound on transitivity is satisfied. We follow 
the second approach. Nieuwenhuis and Rubio (1997) and Vigneron (1994) con- 
sider superposition calculi modulo AG with constraints. Bachmair, Ganzinger 
and Stuber (1995) develop a calculus for commutative rings. Their proof tech- 
nique was not strong enough to avoid certain shortcomings, namely the explicit 
representation of the symmetrization and the weaker notion of redundancy. Su- 
perposition calculi for cancellative abelian monoids require a notion of rewriting 
on equations instead of terms, since additive inverses are in general not available 
(Ganzinger and Waldmann 1996, Waldmann 1997). The special case of divisible 
torsion-free abelian groups allows us to eliminate unshielded variables, which 
avoids the most prolific inferences (Waldmann 1997, Waldmann 1998). Previ- 
ously we have shown that our approach is compatible with constraints for the 
special case of integer modules (Stuber 1996, Stuber 1998a). We have also carried 
it out for commutative rings in the ground case (Stuber 1998b). 

In term rewriting Marché (1996) builds a range of theories from AG to
commutative rings into equational completion. He explicitly adds symmetrizations
and is not concerned with redundancy criteria for extensions. Like standard
completion, his approach does not handle unorientable equations, hence inferences
below variables are not needed. Using the CiME system for completion with
built-in theories (Contejean and Marché 1996), Marché demonstrates that the
special treatment of theories can reduce the number of inferences greatly and can
lead to large speedups.

The notion of symmetrization originates from string rewriting systems for
finitely presented groups (Greendlinger 1960), and is generalized by Le Chenadec
(1986) to various other theories.

Another strand of research leading to this work is concerned with Gröbner
or standard bases for polynomial simplification (Buchberger 1970, Buchberger
1987, Becker and Weispfenning 1993, Bachmair and Tiwari 1997).

The relation between completion for term rewriting systems, which is the
basis of our calculus, and Gröbner basis algorithms has been noticed by
Buchberger and Loos (1983). Bündgen (1996) formalizes this by encoding Gröbner
basis computation, including the computation in the base rings, in term
rewriting systems. Bachmair and Ganzinger (1994b) use constraints to abstract
away computations in the base ring.

As the term ordering for the case of abelian groups or commutative rings we
can use an ordering based on the MAPO (Delor and Puel 1993), while modules
and algebras require a TPO (Stuber 1999b).



2 Preliminaries 



We assume familiarity with first-order logic and term rewriting systems (Baader
and Nipkow 1998, Dershowitz and Jouannaud 1990), and in particular the case
modulo E (Jouannaud and Kirchner 1986). We denote rewriting by $\Rightarrow$,
syntactical equality by $=$, equality in clauses by $\approx$, and the empty
clause by $\bot$.

For a set R of rewrite rules, gnd(R) denotes the set of their ground instances.
By $[\neg](s \approx t)$ we denote a literal that is either positive or negative;
corresponding literals are of the same polarity.



3 The Term Rewriting System 



We require that a theory is represented by a ground term rewriting system T
that is convergent modulo an equational theory E. That is, T is terminating
and Church-Rosser modulo E. Then for any equational proof
$s \Leftrightarrow^*_{T \cup E} t$ there exists a valley proof
$s \Rightarrow^*_T s' \Leftrightarrow^*_E t' \Leftarrow^*_T t$. To avoid
explicitly mentioning E-matching everywhere we assume that it is included in T.
That is, $T = \{\, l' \Rightarrow r \mid l' =_E l,\ l \Rightarrow r \in \mathit{gnd}(T') \,\}$
for some term rewriting system $T'$. We assume a fixed set of function symbols
$\Sigma$. A function symbol f is free in T if there exists a possibly nonground
term rewriting system $T'$ such that $T = \mathit{gnd}(T')$ and f does not occur
in $T'$. Function symbols which are not free are called interpreted. The set of
interpreted function symbols is denoted by $\Sigma_T$. Terms with a free function
symbol at the root position are called atomic and will be denoted by a. We let
Ti = $T \cup E \cup \mathit{Eq}$, where the rules in T are understood as
equations and Eq is the first-order axiomatization of equality for $\Sigma$. This
is the logical content of the theory, with equality made explicit.



4 The Termination Ordering 

We require an E-compatible simplification ordering $\succ_T$ that is total up to
E on ground terms such that $\succ_T$ contains the rewrite system T. We will
usually omit the subscript T as the ordering used will be clear from the context.
An atomic term a is called a maximal atomic term in s if
$s = u[a, a_1, \ldots, a_n]$ where $n \ge 0$, $a_1, \ldots, a_n$ are atomic, u
contains only function symbols from $\Sigma_T$, and $a \succeq a_i$ for
$i = 1, \ldots, n$.
To extend the term ordering to literals and clauses we assign to each of
these a complexity c. For literals we let

$$c(s \approx t) = \{\{s\},\{t\}\} \qquad (1)$$

$$c(s \not\approx t) = \{\{s,t\}\} \qquad (2)$$

and $\succ$ on literals is the two-fold multiset extension of $\succ$ on terms
applied to these complexities. This has the effect that the ordering on literals
is the lexicographic combination of $\succ$ on the maximal term, the ordering
$\neg \succ +$ on the polarity of the literal and $\succ$ on the minimal term.
For a clause $C = L_1 \vee \ldots \vee L_n$ that is not an instance of
transitivity we let

$$c(C) = (\{c(L_1), \ldots, c(L_n)\}, \emptyset) \qquad (3)$$

That is, the complexity of a nontransitivity clause is the pair of the multiset
of the complexities of its literals and the empty multiset. This uses the standard
definition of the clause ordering by Bachmair and Ganzinger (1994a) in its first
component. To extend it to transitivity we consider a ground instance

$$D = t_1 \not\approx s \vee s \not\approx t_2 \vee t_1 \approx t_2$$



and let

$$c(D) = (\{\{\{s\}\}\}, \{t_1, t_2\}).$$


We say that D has the middle term s and the side terms $t_1$ and $t_2$. Then the
ordering on clauses is the lexicographic combination of the three-fold multiset
extension of the term ordering and the multiset extension of the term ordering,
applied to the complexities. By this definition transitivity instances with a middle
term s are immediately below nontransitivity clauses with maximal term s in the
term ordering. We call the middle term of transitivity instances and the maximal
term of other clauses the dominating term of the clause, since it dominates the
term ordering. It is possible to choose other well-orderings for the terms on the
smaller side of literals (Bachmair and Ganzinger 1998) and for the side terms of
a transitivity instance.
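
The effect of the literal complexities (1) and (2) can be made concrete with a
small Python sketch. It is an illustration under assumptions: ground terms are
represented by integers, with the usual order on integers standing in for the
total term ordering, and all function names are hypothetical.

```python
from operator import gt


def multiset_greater(m, n, greater):
    """Multiset extension of the strict order `greater`; m and n are tuples."""
    m_rest, n_rest = list(m), []
    for y in n:                      # cancel elements common to m and n
        if y in m_rest:
            m_rest.remove(y)
        else:
            n_rest.append(y)
    if not m_rest:                   # m equals n or is a proper sub-multiset
        return False
    return all(any(greater(x, y) for x in m_rest) for y in n_rest)


def term_multiset_greater(m, n):
    """One-fold multiset extension of the order on terms (here: integers)."""
    return multiset_greater(m, n, gt)


def complexity(s, t, positive):
    """c(s = t) = {{s},{t}} and c(s != t) = {{s,t}}, as tuples of sorted tuples."""
    return ((s,), (t,)) if positive else (tuple(sorted((s, t))),)


def literal_greater(lit1, lit2):
    """Two-fold multiset extension applied to the literal complexities.

    A literal is a triple (s, t, positive)."""
    return multiset_greater(complexity(*lit1), complexity(*lit2),
                            term_multiset_greater)
```

For example, `literal_greater((3, 1, False), (3, 1, True))` holds, since with
equal maximal term the negative literal is larger, and
`literal_greater((4, 0, True), (3, 2, False))` holds, since the maximal term is
compared first.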



5 The Symmetrization Function 

The symmetrization function is at the heart of our approach. It maps a single 
ground equation into a logically equivalent set of ground rewrite rules that be- 
haves better in combination with the theory. That is, we consider terminating 
term rewriting systems of the form 

$$T \cup \bigcup_{s \approx t} S_T(s \approx t),$$

where the set of rules $S_T(s \approx t)$ is designed such that $s \approx t$
becomes true and as much as possible of T and the equality axioms hold. It turns
out that this works well for all axioms except transitivity, which for some
theories causes problems due to nonconvergent peaks between extended rules.

We start by the notion of a set of rules being (strongly) symmetrized. Being 
symmetrized is a rather technical notion that is required by our general super- 
position calculus. It amounts to the convergence of all critical pairs that involve 
some rule from T or equation from E, and hence validity of the corresponding 
instances of transitivity. The notion of a strongly symmetrized set of rewrite 
rules becomes important when we later instantiate the general framework by 
specific theories. Strong symmetrization allows us to reduce equational proofs 
by normalizing the terms in the proof (Stuber 1997). 

A set of rewrite rules S is symmetrized with respect to T modulo E if for all
peaks $t_1 \Leftarrow_T t \Rightarrow_S t_2$ and for all cliffs
$t_1 \Leftrightarrow_E t \Rightarrow_S t_2$ we have $t_1 \Downarrow_{T \cup S} t_2$.

The set S is called strongly symmetrized with respect to T modulo E if S
can be partitioned into sets $S_i$, $i \in I$, such that $T \cup S_i$ is
convergent modulo E for all $i \in I$.

Proposition 1. If a set of rewrite rules S is strongly symmetrized with respect 
to T modulo E then S is symmetrized with respect to T modulo E. 

Note that S being strongly symmetrized implies that peaks of the form
$t_1 \Leftarrow_{S_i} t \Rightarrow_{S_i} t_2$ converge, which is not guaranteed
if S is symmetrized but not strongly symmetrized. However, this is still much
weaker than convergence, as peaks of the form
$t_1 \Leftarrow_{S_i} t \Rightarrow_{S_j} t_2$ need not converge for $i \ne j$.

Our goal is to derive (strongly) symmetrized sets of rules directly for some
given equation, so that the equation becomes true in the rewrite system. We
break this into two steps. First the equation is brought into a certain
theory-specific normal form by simplification, and then for any such equation a
symmetrized set is obtained by applying a symmetrization function. For now we
only assume that a set $\mathit{Norm}_T$ of equations in this normal form is
given, and postpone the discussion of simplification. We continue by discussing
symmetrization functions.

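A (strong) symmetrization function $S_T$ (for T) maps any equation
$l \approx r$ in $\mathit{Norm}_T$ to a (strongly) symmetrized set of rewrite
rules $S_T(l \approx r)$ such that

$$\mathit{Ti} \cup \{l \approx r\} \vdash S_T(l \approx r) \qquad (4)$$

$$l \Downarrow_{T \cup S_T(l \approx r)} r \qquad (5)$$

$$S_T(l \approx r) \subseteq (\succ) \qquad (6)$$

$$l' \succeq l \text{ for any } l' \Rightarrow r' \text{ in } S_T(l \approx r) \qquad (7)$$

(4) ensures soundness, (5) ensures that $l \approx r$ becomes true, (6) ensures
termination, and (7) ensures that terms smaller than l cannot be rewritten by
$S_T(l \approx r)$. We call a rule $l' \Rightarrow r'$ in
$S_T(l \approx r) \setminus \{l \Rightarrow r\}$ an extension (of
$l \Rightarrow r$). The symmetrization function is extended to sets of equations
in $\mathit{Norm}_T$ by

$$S_T(R) = \bigcup_{l \approx r \in R} S_T(l \approx r).$$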
Assumption 2. We assume from now on that $S_T$ is a symmetrization function
for T modulo E.

To obtain a symmetrization function one starts with a set S containing a single
rule in $\mathit{Norm}_T$ and proceeds by adding to S critical pairs resulting
from peaks of the form $t_1 \Leftarrow_T s \Rightarrow_S t_2$, in a way analogous
to Knuth-Bendix completion. To obtain a strong symmetrization function one also
has to consider critical peaks of the form
$t_1 \Leftarrow_S s \Rightarrow_S t_2$. For the commutative theories that we
consider here it turns out that the symmetrization function obtained by
considering the first kind of peaks also makes the second kind converge. Thus the
strong symmetrization property requires no extra effort in these cases. Without
commutativity, however, an equation may have nontrivial overlaps with variants
of itself. It is infeasible to derive a strong symmetrization function in that
case, hence for instance Le Chenadec (1986) uses ordinary symmetrization for
nonabelian groups.

6 Candidate Models 

In this section we define a model functor I that maps any set N of ground clauses
to an interpretation $I_N$. We show that $I_N$ satisfies the theory and the
equality axioms except for transitivity.

The construction of the interpretation extends the standard one by Bachmair 
and Ganzinger (1998) in several respects. 

Firstly, rewriting is modulo E. Secondly, the built-in term rewriting system T 
is always included when constructing the interpretation. This ensures that these 
interpretations satisfy T. Thirdly, we have the additional restriction that a clause 
can be productive only if the rule it produces is in $\mathrm{Norm}_T$. Finally, the term
rewriting systems are built from symmetrizations of rules, which ensures that 
they are always symmetrized. 

A ground clause $C \lor s \approx t$ is called reductive for $s \Rightarrow t$ if $s \approx t$ is strictly
maximal in $C$ and $s \succ t$. Only reductive clauses can contribute to an interpretation.
Given a set $N$ of ground clauses, we let $N_C$ be the set of ground clauses in $N$
which are smaller than $C$. For any set $N$ of ground clauses we inductively define
a set $R_N$ of ground rules, a symmetrized set $S_N = S_T(R_N)$ of ground rules, and
the corresponding interpretation $I_N = (T \cup S_N)^{\Downarrow}$. Here $(T \cup S_N)^{\Downarrow}$ denotes the
valley closure of $T \cup S_N$, i.e., the set $\{s \approx t \mid s \Downarrow_{T \cup S_N} t\}$. We may regard $R$, $S$
and $I$ as functions which map sets of clauses to sets of rewrite rules or equations.
A rule $l \Rightarrow r$ is in $R_N$ if there exists a clause $C = C' \lor l \approx r$ in $N$ such that
(i) $C$ is false in $I_{N_C}$, (ii) $C$ is reductive for $l \Rightarrow r$, (iii) $l \Rightarrow r$ is in $\mathrm{Norm}_T$,
(iv) $l$ is irreducible by $S_{N_C}$, and (v) $C'$ is false in $(T \cup S_{N_C} \cup S_T(l \Rightarrow r))^{\Downarrow}$.
In this case we say that $C$ produces $l \Rightarrow r$ in $R_N$, or that $C$ is productive. The
set $R_N$ is well-defined, since for any ground clause $C$ only the interpretation for
smaller clauses in $N_C$ determines whether $C$ produces a rule. Where $N$ is clear
from the context we write $R_C$ for $R_{N_C}$, $S_C$ for $S_{N_C}$ and $I_C$ for $I_{N_C}$.

Lemma 3. Let $C = C' \lor l \approx r$ be a clause that produces $l \Rightarrow r$ in $R_N$. Then
$C'$ is false in $I_N$.

We let $T_0 = \mathrm{Refl} \cup \mathrm{Symm} \cup \mathrm{Mon} \cup E \cup T$.

Lemma 4. Let $N$ be a set of ground clauses. Then $I_N \models T_0$.

It remains to consider instances of transitivity and clauses in $N$. These are in
general not true in $I_N$. Validity of transitivity instances with middle term up
to $s$ in $I_N$ is equivalent to the Church-Rosser property of $T \cup S_N$ on terms up
to $s$ (Bachmair and Ganzinger 1994a). For commutative rings there are cases
where two extended rules overlap in such a way that the resulting critical pair
does not converge. Then $T \cup S_N$ is not Church-Rosser and transitivity does not
hold. We say that a clause $C$ in $\mathrm{Trans} \cup N$ is a counterexample for $I_N$ if $C$ is
false in $I_N$. Since the set of possible counterexamples is well-ordered by $\succ$ there
is always a minimal counterexample if $I_N$ is not a model of $N \cup T_1$.

7 Redundancy of Clauses and Simplification 

We will need to refer to the specific construction of candidate models when we
prove that certain clauses or inferences are redundant. In particular, we need that
candidate models are built from (strongly) symmetrized sets of rewrite rules $S_N$,
and we need to refer to the presence of certain rewrite rules in $R_N$. We achieve
this by defining a special notion of consequence that takes into account only
interpretations constructed by the model functor $I$, and by introducing a new
atomic formula $s \Rightarrow t$ that is true in such an interpretation $I_N$ whenever the rule
$s \Rightarrow t$ is in $R_N$. Note that the $R_N$ corresponding to $I_N$ will always be known
from the context via the set $N$ of clauses. These atoms will be used only as unit
clauses, and we will refer to them as rewrite rules. For sets of clauses or rewrite
rules $N_1$ and $N_2$ we say that $N_2$ is an $I$-consequence of $N_1$, in symbols $N_1 \models_I N_2$,
if $I_M \models N_1$ implies $I_M \models N_2$ for all sets of ground clauses $M$. Lemma 4 can
then be rephrased as $\models_I T_0$.

Let $C$ be some ground clause. We write $\mathrm{Trans}_C$ for the set of ground instances
of transitivity in $\mathrm{Trans}$ which are smaller than $C$. The middle term of such an
instance of transitivity is smaller than or equal to the dominating term of $C$.
Then $C$ is redundant (with respect to $T$) in a set of ground clauses $N$ if

$N_C \cup \mathrm{Trans}_C \models_I C.$

A (possibly nonground) clause is called redundant in a set of clauses $N$ if all its
ground instances are redundant in the set of ground instances of $N$. A clause
is called redundant if it is redundant in $\emptyset$. A clause that is redundant in $N$
cannot be the minimal counterexample for $I_N$, because some smaller clause in
$N_C \cup \mathrm{Trans}_C$ would have to be a counterexample for $I_N$ as well. Note that
we can use $N_C \cup \mathrm{Trans}_C \cup T_0 \models C$ as a sufficient criterion for redundancy.
This criterion corresponds to the notion of redundancy used by Bachmair and
Ganzinger (1998).

Based on our notion of redundancy, we say that a ground clause $D$ is a
simplification (with respect to $T$) of a ground clause $C$ if $\{C\} \cup T_1 \models D$ and $C$
is redundant in $\{D\}$. That is, $C \succ D$, $\{C\} \cup T_1 \models D$, and $\{D\} \cup \mathrm{Trans}_C \models_I C$.
Remember that the symmetrization function is only defined on equations
in $\mathrm{Norm}_T$. Equations not in this form need to be simplified before symmetrization
can be applied. However, we want to restrict simplifications as much as possible,
since ground simplifications become inferences when they are lifted. We formalize
this by assuming that there exists a simplification function $\mathrm{Simp}_T$ which maps
ground literals to sets of ground literals, such that $L'$ is a simplification of $L$
for all $L' \in \mathrm{Simp}_T(L)$. We say that $\mathrm{Simp}_T$ is admissible with respect to $\mathrm{Norm}_T$
if $\{L \mid \mathrm{Simp}_T(L) = \emptyset\} \subseteq \mathrm{Norm}_T$ where $\mathrm{Norm}_T$ is extended to literals in the
obvious way. Since $\succ$ is well-founded, it suffices to nondeterministically apply
simplifications in $\mathrm{Simp}_T$ to eventually reach a literal in $\mathrm{Norm}_T$.

Assumption 5. From now on we assume $\mathrm{Simp}_T$ to be admissible with respect
to $\mathrm{Norm}_T$.

The definition of $\mathrm{Simp}_T$ imposes certain properties on $\mathrm{Norm}_T$. Since any literal
can be simplified to some literal in $\mathrm{Norm}_T$, and since simplification preserves
$T_1$-equivalence, for any literal there is a $T_1$-equivalent literal in $\mathrm{Norm}_T$. This
literal is in general not unique, since we want to avoid unnecessary reductions
in right-hand sides of equations that would lead to inferences on nonmaximal
summands. Moreover, for strong symmetrization functions the requirement that
$l$ is minimal among the left-hand sides of rules in $S_T(l \Rightarrow r)$ translates into the
requirement that equations in $\mathrm{Norm}_T$ are left-minimal. That is, $l$ is minimal
among the greater sides of $T_1$-equivalent equations. For if this were not the
case, say there exists $l' \approx r'$ with $l' \succ r'$ and $l \succ l'$, then $l'$ must be reducible
by $S_T(l \approx r)$, by some rule with left-hand side smaller than $l$.

8 The Inference System 

We present a ground inference system that is based on the parameters introduced
in the previous sections, namely the term rewriting system $T$, the ordering $\succ$,
the set $\mathrm{Norm}_T$, the symmetrization function $S_T$ and the simplification
function $\mathrm{Simp}_T$.

We assume that in each ground clause a literal is selected; either some arbitrary
negative literal, or a positive literal that is maximal in the entire clause.
An inference system is a set of inferences. Each inference has a main premise $C$,
side premises $C_1, \ldots, C_n$, and a conclusion $D$. These notions allow uniform
definitions of reduction property and redundancy of inferences. Think of the main
premise as the minimal counterexample, the side premises as productive clauses,
and the conclusion as a smaller clause that is also false, but that is not in $N$.
Thus, the main premise may either be a clause supposed to be from $N$, then we
write

$$\frac{C_1 \;\ldots\; C_n \quad C}{D}$$

for the inference, with the main premise at the right. Or the main premise may
be an instance of transitivity, then we omit it and write

$$\frac{C_1 \;\ldots\; C_n}{D}$$

for the inference. In this case we state the main premise explicitly. Instances
of transitivity cannot be side premises. An inference is strictly decreasing if the
conclusion is smaller than the main premise in the clause ordering. All inferences
that we present are sound and strictly decreasing.

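Let $l_1 \Rightarrow r_1$ and $l_2 \Rightarrow r_2$ be rules in $\mathrm{Norm}_T$ and $S_i = S_T(l_i \Rightarrow r_i)$ for $i = 1, 2$.
An extension peak between $l_1 \Rightarrow r_1$ and $l_2 \Rightarrow r_2$ with respect to $T$ is a rewrite
sequence

$r_1' \Leftarrow_{S_1} l_1'[l_2'] \Rightarrow_{S_2} l_1'[r_2']$

such that $l_i' \Rightarrow r_i'$ is a rule in $S_i$ for $i = 1, 2$, $l_1$ is irreducible by $T \cup S_2$, and $l_2$
is irreducible by $T \cup S_1$.

We let $\mathrm{Sup}_T$ be the set of the following inferences: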



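$T$-Extension Superposition

$$\frac{l_1 \approx r_1 \lor C_1 \qquad l_2 \approx r_2 \lor C_2}{r_1' \approx l_1'[r_2'] \lor C_1 \lor C_2}$$

if (i) $l_i \approx r_i$ is in $\mathrm{Norm}_T$ and selected in $l_i \approx r_i \lor C_i$ for $i = 1, 2$, and
(ii) there exists an extension peak $r_1' \Leftarrow_{S_1} l_1'[l_2'] \Rightarrow_{S_2} l_1'[r_2']$
between $l_1 \Rightarrow r_1$ and $l_2 \Rightarrow r_2$.

The transitivity instance corresponding to the peak,

$r_1' \not\approx l_1'[l_2'] \lor l_1'[l_2'] \not\approx l_1'[r_2'] \lor r_1' \approx l_1'[r_2']$,

is the main premise of this inference. The explicit premises are side premises.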

$T$-Theory Simplification

$$\frac{L \lor C}{L' \lor C}$$

if (i) $L$ is selected in $L \lor C$, and (ii) $L' \in \mathrm{Simp}_T(L)$.

$T$-Reflexivity Resolution

$$\frac{p \not\approx q \lor C}{C}$$

if (i) $p \not\approx q$ is in $\mathrm{Norm}_T$ and selected in $p \not\approx q \lor C$, and (ii) $p =_E q$.

$T$-Equality Factoring

$$\frac{s \approx t \lor s' \approx t' \lor C}{t \not\approx t' \lor s' \approx t' \lor C}$$

if (i) $s \approx t$ is in $\mathrm{Norm}_T$ and selected in $s \approx t \lor s' \approx t' \lor C$, and (ii) $s =_E s'$.

The single premise of Theory Simplification, Reflexivity Resolution and Equality
Factoring is their main premise; they have no side premises.



$T$-Superposition

$$\frac{l \approx r \lor D \qquad [\neg](s[l''] \approx t) \lor C}{[\neg](s[r'] \approx t) \lor C \lor D}$$

if (i) $l' \Rightarrow r'$ is in $S_T(l \approx r)$, (ii) $l' =_E l''$, (iii) $[\neg](s[l''] \approx t)$ is selected
in $[\neg](s[l''] \approx t) \lor C$ and in $\mathrm{Norm}_T$, and (iv) $l \approx r$ is in $\mathrm{Norm}_T$ and selected
in $l \approx r \lor D$.

Superposition has the main premise $[\neg](s[l''] \approx t) \lor C$ and the side premise
$l \approx r \lor D$.

An inference with main premise $C$, conclusion $D$ and side premises
$C_1, \ldots, C_n$, where $C_i = C_i' \lor l_i \approx r_i$ is reductive for $l_i \Rightarrow r_i$, is redundant
in $N$ if

$N_C \cup \mathrm{Trans}_C \cup \{l_i \Rightarrow r_i \mid i = 1, \ldots, n\} \cup \{\neg C_i' \mid i = 1, \ldots, n\} \models_I D.$

An inference is called redundant if it is redundant in $\emptyset$. Here we exploit that side
premises arise from productive clauses. Hence each $C_i$ has the form $C_i' \lor l_i \approx r_i$
and is reductive for $l_i \Rightarrow r_i$ which is in $\mathrm{Norm}_T$. We may assume that $l_i \Rightarrow r_i$ is
in $R_N$ and that $C_i'$ is false in $I_N$.

Let $C$ be the minimal counterexample for $I_N$ and let $\pi$ be an inference with
main premise $C$, conclusion $D$ and side premises $C_1, \ldots, C_k$ such that $C \succ D$,
and each $C_i$ is smaller than $C$, has the form $C_i = l_i \approx r_i \lor C_i'$ and is reductive
for $l_i \Rightarrow r_i$. We say that $\pi$ reduces $C$ (with respect to $I_N$) if

$I_N \models \neg D \land l_1 \Rightarrow r_1 \land \ldots \land l_k \Rightarrow r_k \land \neg C_1' \land \ldots \land \neg C_k'.$

An inference system Calc has the reduction property (with respect to $I$) if
Calc contains an inference that reduces $C$ with respect to $I_N$ for any set $N$
of ground clauses such that $I_N$ has the minimal counterexample $C \neq \bot$. A
refutation of a clause set $N$ in a calculus Calc is a sequence $C_1, \ldots, C_n$ of clauses
such that each clause $C_i$ is either from $N$ or can be derived from clauses earlier
in the sequence by an inference in Calc, and $C_n = \bot$. A calculus is refutationally
complete for $T_1$ if for any set $N$ of clauses such that $T_1 \cup N$ is inconsistent
there exists a refutation of $N$ in Calc. Calculi with the reduction property are
refutationally complete (Bachmair and Ganzinger 1998).

Lemma 6 (Extension Superposition). Let $N$ be a set of ground clauses such
that $N$ does not contain the empty clause. Suppose that the minimal
counterexample $C$ for $I_N$ is an instance of transitivity. Then $\mathrm{Sup}_T$ contains an
Extension Superposition inference that reduces $C$.

Proof: Let $C$ be the minimal counterexample and let $s$ be its middle term. Since
$C$ is minimal, instances of transitivity with smaller middle terms are true in $I_N$.
This implies that $T \cup S_C$ is Church-Rosser modulo $E$ below $s$, but that there
exists some peak $t_1 \Leftarrow s \Rightarrow t_2$ such that $t_1$ and $t_2$ do not converge and $t_1 \approx t_2$ is
false in $I_N$. As $T$ is convergent and $S_C$ is symmetrized modulo $E$ with respect
to $T$, all peaks involving $T$ and all cliffs with $E$ converge, so both rules used in
the peak are from $S_C$. If the rewrite steps in the peak were in parallel positions
of $s$ then $t_1$ and $t_2$ would converge, which is not the case. Let $l_1' \Rightarrow r_1'$ and
$l_2' \Rightarrow r_2'$ be the rules from $S_C$ used in the peak. For $i = 1, 2$ the rule $l_i' \Rightarrow r_i'$ is
from some symmetrization $S_T(l_i \Rightarrow r_i)$ where $l_i \Rightarrow r_i$ is a rule in $R_C$ that has
been produced by some clause $C_i = l_i \approx r_i \lor C_i'$. If we suppose without loss
of generality that $l_1 \succ l_2$ then $l_1$ is irreducible by $S_T(l_2 \Rightarrow r_2)$ because this is a
condition for $C_1$ being productive, and $l_2$ is irreducible by $S_T(l_1 \Rightarrow r_1)$ because
$l_1 \succ l_2$ and $l_2$ is minimal among the left-hand sides in $S_T(l_2 \Rightarrow r_2)$. Hence this
is an extension peak of the form

$t_1 = s'[r_1'] \Leftarrow s'[l_1'[l_2']] \Rightarrow s'[l_1'[r_2']] = t_2.$

Since $C$ is the minimal counterexample, the context must be empty, and the
peak has the form

$t_1 = r_1' \Leftarrow l_1'[l_2'] \Rightarrow l_1'[r_2'] = t_2.$

For such a peak $\mathrm{Sup}_T$ contains the Extension Superposition inference

$$\frac{l_1 \approx r_1 \lor C_1' \qquad l_2 \approx r_2 \lor C_2'}{r_1' \approx l_1'[r_2'] \lor C_1' \lor C_2'}$$

where $C_i'$ is false in $I_N$ and $l_i \Rightarrow r_i$ is in $R_N$ for $i = 1, 2$. Since $C_1'$, $C_2'$ and
$r_1' \approx l_1'[r_2']$ are false in $I_N$, the conclusion is false in $I_N$. Hence the inference
reduces $C$. □

For each of the other minimal counterexamples, which arise when some condition 
for productivity is violated, there is an inference that reduces it. Due to lack of 
space we have to skip the detailed proof; it is analogous to the case of standard 
superposition. 

Theorem 7. $\mathrm{Sup}_T$ has the reduction property.

Putting things together and making all the conditions explicit we obtain the 
main theorem, stating that our method yields refutationally complete calculi: 

Theorem 8. Suppose that $T$ is a ground term rewriting system that is confluent
modulo $E$, $\succ$ is an $E$-compatible simplification ordering that is total up to $E$
on ground terms, $T \subseteq (\succ)$, $\mathrm{Simp}_T$ is a simplification function, and $S_T$ is a
symmetrization function for $T$ modulo $E$ with respect to $\succ$ such that $\mathrm{Simp}_T$ is
admissible with respect to $S_T$. Then $\mathrm{Sup}_T$ is refutationally complete for $T_1$.

Variations are possible, for instance it is straightforward to replace equality 
factoring by factoring and merging paramodulation. 

9 Extension Peaks Revisited 

In the Extension Superposition rules stated above any extension peak between 
two rules leads to an inference, leading to a large or infinite number of inferences 
for any pair of clauses whose symmetrizations overlap. For specific theories we 
can do much better by exploiting the known structure of the symmetrizations. 
For the theories that we have considered either none or a single Extension Super- 
position inference suffices; all other such inferences are redundant. We call the 
extension peaks that give rise to these Extension Superposition inferences criti- 
cal extension peaks. Furthermore, critical extension peaks are the only cause of 
transitivity counterexamples. We exploit this to relax the bound on transitivity
somewhat, by bounding only subterms that occur as the middle terms in critical 
extension peaks. We need this extension in the case of commutative rings, where 
the bound is then on single summands instead of on the whole sum. 

Let $S_i = S_T(l_i \Rightarrow r_i)$ for $i = 1, 2$. Consider some extension peak
$t_1 \Leftarrow_{S_1} s \Rightarrow_{S_2} t_2$ between rules $l_1 \Rightarrow r_1$ and $l_2 \Rightarrow r_2$. We call such an
extension peak redundant in $N$ if all Extension Superposition inferences that have
the peak as their main premise are redundant in $N$. We call the extension peak
redundant if it is redundant in $\emptyset$.

Lemma 9. Let $t_1 \Leftarrow_{S_1} s \Rightarrow_{S_2} t_2$ be an extension peak between rules $l_1 \Rightarrow r_1$ and
$l_2 \Rightarrow r_2$. The peak is redundant in $N$ if and only if the extension superposition
inference

$$\frac{l_1 \approx r_1 \qquad l_2 \approx r_2}{t_1 \approx t_2}$$

with main premise $t_1 \not\approx s \lor s \not\approx t_2 \lor t_1 \approx t_2$ is redundant in $N$.

10 Commutative Rings 

Due to space limitations we can only point out a few of the most important
aspects for the case of commutative rings. We use the well-known convergent
term rewriting system modulo AC for commutative rings of Peterson and Stickel
(1981) and call it CR. $\mathrm{Norm}_{CR}$ consists of equations of the form $n\phi \approx r$ where
$\phi$ is irreducible with respect to CR, $\phi \succ r$, $n \geq 1$ and $\phi = a_1 \cdots a_k$ for $k \geq 0$,
with $\phi = 1$ for $k = 0$. Here we use $n\phi$ as a shorthand for $\phi_1 + \cdots + \phi_n$ where
$\phi_1 =_{AC} \cdots =_{AC} \phi_n$. For $n > 0$ we also use $(-n)\phi$ for $n(-\phi)$, and $0\phi$ denotes
just $0$. A term $n\phi$ is called a monomial. We have the following symmetrization
for commutative rings:

$S_{CR}(a \approx r) = \{a \Rightarrow r\}$ if $a$ is atomic or 1;

$S_{CR}(\phi \approx r) = \{\phi \Rightarrow r\}$
  $\cup\ \mathrm{gnd}(\{y \cdot \phi \Rightarrow y \cdot r\})$ if $\phi$ is a proper product;

$S_{CR}(n\phi \approx r) = \{n\phi \Rightarrow r\}$
  $\cup\ \mathrm{gnd}(\{x + n\phi \Rightarrow x + r\})$
  $\cup\ \mathrm{gnd}(\{n(y \cdot \phi) \Rightarrow y \cdot r\})$
  $\cup\ \mathrm{gnd}(\{x + n(y \cdot \phi) \Rightarrow x + y \cdot r\})$
  $\cup\ \{-\phi \Rightarrow (n - 1)\phi - r\}$
  $\cup\ \mathrm{gnd}(\{-(y \cdot \phi) \Rightarrow (n - 1)(y \cdot \phi) - (y \cdot r)\})$ if $n \geq 2$.
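
As an illustration that is not taken from the paper, consider a hypothetical atomic term $a$ and the equation $2a \approx b$, assuming $a \succ b$ so that the equation is in $\mathrm{Norm}_{CR}$. Instantiating the last case of the definition with $n = 2$ and $\phi = a$, where $(n - 1)\phi$ is simply $a$, would give the following rule schemes:

```latex
\begin{align*}
S_{CR}(2a \approx b) ={}& \{\, 2a \Rightarrow b \,\} \\
 &\cup\ \mathrm{gnd}(\{\, x + 2a \Rightarrow x + b \,\}) \\
 &\cup\ \mathrm{gnd}(\{\, 2(y \cdot a) \Rightarrow y \cdot b \,\}) \\
 &\cup\ \mathrm{gnd}(\{\, x + 2(y \cdot a) \Rightarrow x + y \cdot b \,\}) \\
 &\cup\ \{\, -a \Rightarrow a - b \,\} \\
 &\cup\ \mathrm{gnd}(\{\, -(y \cdot a) \Rightarrow (y \cdot a) - (y \cdot b) \,\})
\end{align*}
```

The rule for the inverse is sound in any commutative ring, since $2a \approx b$ implies $-a \approx a - 2a \approx a - b$.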

Extension peaks arise between extensions for multiplication. We obtain the fol- 
lowing criterion, which leaves at most a single nonredundant extension peak for 
any pair of rules: 
Theorem 10. Let $n_i\phi_i \Rightarrow r_i$ be a rewrite rule in $\mathrm{Norm}_{CR}$ for $i = 1, 2$, and
assume without loss of generality $n_1 \geq n_2$.

These two rules have the single critical extension peak

$\psi_1 r_1 \Leftarrow n_1\phi \Rightarrow (n_1 - n_2)\phi + \psi_2 r_2$

where $\phi =_{AC} \mathrm{lcm}(\phi_1, \phi_2) =_{AC} \psi_1\phi_1 =_{AC} \psi_2\phi_2$ if (1) $\psi_1 \neq 1$, (2) $n_1 > n_2$ or
$\psi_2 \neq 1$, and (3) either (a) $n_1, n_2 \geq 2$, (b) $n_1 \geq 2$, $n_2 = 1$ and $\phi_2$ is a proper
product with $\gcd(\phi_1, \phi_2) \neq 1$, or (c) $n_1 = n_2 = 1$ and $\phi_1$ and $\phi_2$ are proper
products with $\gcd(\phi_1, \phi_2) \neq 1$.

Otherwise there is no critical extension peak between these two rules.



Since only few instances of transitivity are not redundant, the truth of transi- 
tivity up to some bound implies that transitivity is true even up to the next 
nonredundant instance. Since by Theorem 10 nonredundant peaks have a sin- 
gle summand at the top, transitivity holds for all instances whose middle term 
contains only summands that do not exceed the bound individually, even if the 
sum as a whole is greater than the bound. 

This extension is necessary to prove that isolation for commutative rings
is compatible with the notion of redundancy, where isolation is the following
inference:

CR-Isolation

$$\frac{[\neg](m\phi_1 + s \approx n\phi_2 + t) \lor C}{[\neg]((m - n)\phi_1 \approx t - s) \lor C}$$

if (i) $\phi_1 =_{AC} \phi_2$, (ii) $\phi_1$ is a product, (iii) $\phi_1$ is irreducible with respect
to CR, (iv) $m \geq n$, (v) $n \neq 0$ or $s \neq 0$, and (vi) $\phi_1 \succ s$ and $\phi_1 \succ t$.
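
A small worked instance may help here. The constants $a$, $b$, $c$, $d$ and the ordering assumptions are hypothetical and not from the paper: take $\phi_1 = \phi_2 = a \cdot b$, $m = 3$, $n = 1$, $s = c$ and $t = d$, and assume that $a \cdot b$ is irreducible with respect to CR with $a \cdot b \succ c$ and $a \cdot b \succ d$. Under these assumptions the positive-literal case of isolation would read:

```latex
\[
\frac{3(a \cdot b) + c \approx (a \cdot b) + d \;\lor\; C}
     {2(a \cdot b) \approx d - c \;\lor\; C}
\]
```

The conclusion corresponds to subtracting $(a \cdot b) + c$ from both sides of the premise equation.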



Here the most difficult case arises when maximal summands in a negative literal
must be cancelled. In this case we have to prove

$\{m\phi + s \approx n\phi + t\} \cup \mathrm{Trans}_D \models_I (m - 1)\phi + s \approx (n - 1)\phi + t,$

where we must be careful not to exceed the bound $D$ on transitivity, which is
dominated by the maximal summand $m\phi$ or $n\phi$. Without this bound it would be
sufficient to add $-\phi$ on both sides and to normalize with respect to the rewrite
system CR for commutative rings. Since $-\phi$ exceeds the bound we have to be
more careful. In the candidate model we have a rewrite proof

$m\phi + s \Rightarrow n_0\phi + r_1 + s \Downarrow n_0\phi + r_2 + t \Leftarrow n\phi + t,$

where reductions of maximal terms are done first and $n_0\phi$ is not reduced. We
obtain rewrite proofs for (1) $r_1 + s \approx r_2 + t$ from the middle part and (2) $r_1 \approx
(m - n)\phi + r_2$ from the peak formed by the reductions of the maximal terms.
Using contexts $-r_1 + (m - 1)\phi + [\,]$ on (1) and $-[\,] + (m - 1)\phi + r_2 + t$ on (2) these
can be combined to a suitable proof. The main idea to stay below the bound is
to first reduce the maximal terms and then cancel the resulting smaller term $r_1$.
By normalizing each term in this proof with respect to CR we obtain a proof
that contains $(m - 1)\phi$ as its maximal monomial, hence transitivity holds for all
terms in the proof and we obtain a valley proof $(m - 1)\phi + s \Downarrow (n - 1)\phi + t$.
Note that the normalization removes all occurrences of $-\phi$ in the dashed part
of the proof.

Apart from Isolation the simplification function consists of rewriting with CR, 
which does not pose problems with respect to the bounded validity of transitivity. 
Simplification by rewriting may be limited such that rewriting always involves 
maximal summands, i.e., either reduce maximal summands, or cancel maximal 
summands by the inverse law. Other inferences like Superposition and Equality 
Factoring operate only on left-hand sides, which also restricts them to maximal 
terms. With these remarks the ground calculus for commutative rings can easily 
be derived from the general case. 



11 Conclusion 

We have presented an approach for constructing refutationally complete ground 
calculi for theories that can be represented by convergent term rewriting systems. 
This approach is suitable in particular for theories such as abelian groups or com- 
mutative rings. For commutative rings, overlaps of AC-extensions with respect to 
multiplication lead to problems with transitivity, and in turn to difficulties in the 
completeness proof. These were overcome by developing a redundancy criterion 
for these extension peaks, and by considering the effect of these redundancies on 
the validity of transitivity. 
Abstract. Right-linear finite path overlapping TRS are shown to effectively
preserve recognizability. The class of right-linear finite path
overlapping TRS properly includes the class of linear generalized semi- 
monadic TRS and the class of inverse left-linear growing TRS, which are 
known to effectively preserve recognizability. Approximations by inverse 
right-linear finite path overlapping TRS are also discussed. 



1 Introduction 

Much effort has been devoted to finding subclasses of TRSs which have reasonable
computational power and for which important problems are decidable and,
if possible, efficiently solvable. Tree automata inherit many favorable properties
of finite-state automata on strings [5]. For a tree automaton $M$, let $L(M)$ be the
set of terms accepted by $M$. A set $T$ of terms is recognizable if there is a tree
automaton $M$ with $T = L(M)$. The class of recognizable sets is closed under
boolean operations (union, intersection and complementation), and the emptiness
problem is decidable for a recognizable set. If TRSs and recognizable sets of
terms can be related appropriately, then the favorable properties of recognizable
sets help us solve some problems in TRSs.

Two different directions for relating TRS and recognizable sets exist. One
direction is the study of a TRS which effectively preserves recognizability [2, 6,
7, 8, 11, 13]. For a TRS $R$ and a set $L$ of terms, define $R^*(L) = \{t \mid \exists s \in
L \text{ s.t. } s \to_R^* t\}$. A TRS $R$ is said to effectively preserve recognizability if, for
any tree automaton $M$, $R^*(L(M))$ is also recognizable and a tree automaton
$M^*$ such that $R^*(L(M)) = L(M^*)$ can be effectively constructed. Joinability,
reachability and local confluence are decidable for a TRS which effectively
preserves recognizability [7, 8]. Since it is undecidable whether a given TRS
effectively preserves recognizability or not [6], decidable subclasses of TRSs which
effectively preserve recognizability have been investigated. Such classes include
ground TRS [1], right-linear monadic TRS [13], linear semi-monadic TRS [2] and
linear generalized semi-monadic TRS [8]. Another direction of the study for
relating TRS and recognizable sets is to find a class of TRS $R$ such that the set
$(R^{-1})^*(L(M)) = \{t \mid \exists s \in L(M) \text{ s.t. } t \to_R^* s\}$ is recognizable for any tree
automaton $M$ [4, 10, 12]. A linear growing TRS [10] has this property, and later,
the result was extended to left-linear growing TRS [12]. Obviously, if a TRS $R$
has this property, then $R^{-1} = \{l \to r \mid r \to l \in R\}$ preserves recognizability,
and vice versa. A TRS $R$ is (right-)linear semi-monadic if and only if $R^{-1}$ is
(left-)linear growing except that the variable restriction ($l$ is not a variable and
$Var(r) \subseteq Var(l)$ for each $l \to r \in R$) is dropped in [10, 12].

In this paper, a new class of TRSs, right-linear finite path overlapping TRS 
is proposed. A TRS in the class effectively preserves recognizability (Section 4), 
and the class properly includes known decidable classes of TRSs which effec- 
tively preserve recognizability (Section 3). Section 5 discusses approximations 
by inverse right-linear finite path overlapping TRS. 



2 Preliminaries 

2.1 Term Rewriting Systems 

We use the usual notions for terms, substitutions, etc. (see [3] for details). Let
$\mathcal{F}$ be a finite set of function symbols and $\mathcal{X}$ an enumerable set of variables.
The arity of $f \in \mathcal{F}$ is denoted by $a(f)$. $\mathcal{F}$ is called a signature. The set of
$\mathcal{F}$-terms, or simply terms, defined in the usual way, is denoted by $\mathcal{T}(\mathcal{F}, \mathcal{X})$.
The set of variables occurring in $t$ is denoted by $Var(t)$. A term $t$ is ground if
$Var(t) = \emptyset$. The set of ground $\mathcal{F}$-terms is denoted by $\mathcal{G}(\mathcal{F})$. A term is linear if
no variable occurs more than once in the term. A substitution $\sigma$ is a mapping
from $\mathcal{X}$ to $\mathcal{T}(\mathcal{F}, \mathcal{X})$, and written as $\sigma = \{x_1 \mapsto t_1, \ldots, x_n \mapsto t_n\}$ where $t_i$ with
$1 \leq i \leq n$ is a term which substitutes for the variable $x_i$. The term obtained by
applying a substitution $\sigma$ to a term $t$ is written as $t\sigma$. $t\sigma$ is called an instance of
$t$ and $t$ is said to subsume $t\sigma$. An occurrence (or position) in a term $t$ is defined
as a sequence of positive integers as usual, and the set of all the occurrences
in a term $t$ is denoted by $Occ(t)$. Let $\lambda$ denote the empty occurrence. If an
occurrence $o_1$ is a prefix (resp. proper prefix) of $o_2$, then we write $o_1 \preceq o_2$
(resp. $o_1 \prec o_2$). Two occurrences $o_1$ and $o_2$ are disjoint if neither $o_1 \preceq o_2$ nor
$o_2 \preceq o_1$. A subterm of $t$ at an occurrence $o$ is denoted by $t/o$. $t/o$ is said to
occur at depth $|o|$. The depth of a term $t$ is $\max\{|o| \mid o \in Occ(t)\}$. If a term
$t$ is obtained from a term $t'$ by replacing the subterms of $t'$ at occurrences
$o_1, \ldots, o_m$ ($o_i \in Occ(t')$, $o_i$ and $o_j$ are disjoint if $i \neq j$) with terms $t_1, \ldots, t_m$,
respectively, then we write $t = t'[o_i \leftarrow t_i \mid 1 \leq i \leq m]$.
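
The notions of occurrence, subterm and replacement can be made concrete with a small sketch. The encoding is an assumption made only for illustration: a variable is a Python string, and a function application is a tuple whose first component is the function symbol, so a constant `a` is `("a",)`. The function names are hypothetical.

```python
# Illustrative encoding (an assumption, not from the paper):
#   variable            -> str, e.g. "x"
#   f(t1, ..., tn)      -> ("f", t1, ..., tn); a constant a is ("a",)
# An occurrence is a tuple of positive integers; () is the empty occurrence.

def occurrences(t, prefix=()):
    """Return Occ(t) as a list of tuples, in depth-first order."""
    result = [prefix]
    if isinstance(t, tuple):
        for i, arg in enumerate(t[1:], start=1):
            result.extend(occurrences(arg, prefix + (i,)))
    return result

def subterm(t, o):
    """Return t/o, the subterm of t at occurrence o (o must be in Occ(t))."""
    for i in o:
        t = t[i]
    return t

def replace(t, o, u):
    """Return t[o <- u]: t with the subterm at occurrence o replaced by u."""
    if not o:
        return u
    i = o[0]
    return t[:i] + (replace(t[i], o[1:], u),) + t[i + 1:]

def depth(t):
    """Depth of t: the maximum length |o| over all occurrences o of t."""
    return max(len(o) for o in occurrences(t))

def is_prefix(o1, o2):
    """True if occurrence o1 is a (not necessarily proper) prefix of o2."""
    return o2[:len(o1)] == o1

def disjoint(o1, o2):
    """True if neither occurrence is a prefix of the other."""
    return not is_prefix(o1, o2) and not is_prefix(o2, o1)

# Example term f(g(x), a)
t = ("f", ("g", "x"), ("a",))
```

With this encoding, `subterm(t, (1, 1))` is the variable `"x"`, and `replace(t, (2,), ("b",))` yields the encoding of f(g(x), b).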

A rewrite rule is an ordered pair of terms, written as $l \to r$. The variable
restriction ($Var(r) \subseteq Var(l)$ and $l$ is not a variable) is not assumed unless stated
otherwise. A term rewriting system (TRS) is a finite set of rewrite rules. For
terms $t$, $t'$ and a TRS $R$, we write $t \to_R t'$ if there exist an occurrence $o \in
Occ(t)$, a substitution $\sigma$ and a rewrite rule $l \to r \in R$ such that $t/o = l\sigma$ and
$t' = t[o \leftarrow r\sigma]$. Define $\to_R^*$ (resp. $\leftrightarrow_R^*$) to be the reflexive and transitive (resp.
the reflexive, symmetric and transitive) closure of $\to_R$. The subscript $R$ of $\to_R$
is omitted if $R$ is clear from the context. A redex (in $R$) is an instance of $l$
for some $l \to r \in R$. A normal form (in $R$) is a term which has no redex as its
subterm. Let $NF_R$ denote the set of all ground normal forms in $R$. A rewrite rule
$l \to r$ is left-linear (resp. right-linear) if $l$ is linear (resp. $r$ is linear). A rewrite
rule is linear if it is left-linear and right-linear. A TRS $R$ is left-linear (resp.
right-linear, linear) if every rule in $R$ is left-linear (resp. right-linear, linear). A
TRS $R$ is orthogonal if $R$ is left-linear and has no critical pair. For a TRS $R$,
let $R^{-1}$ denote $\{l \to r \mid r \to l \in R\}$. For a class $C$ of TRSs, let $C^{-1}$ denote the
class of TRSs $\{R \mid R^{-1} \in C\}$.
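
The one-step rewrite relation can be sketched as follows. This is a minimal illustration under assumptions that are not from the paper: terms use the encoding "variable = string, application = tuple headed by its symbol", rules are pairs `(l, r)`, and any variable of `r` that does not occur in `l` is simply left uninstantiated (without the variable restriction such a variable could be replaced by any term). All function names are hypothetical.

```python
def match(pattern, term, subst=None):
    """Return a substitution s with pattern*s == term, or None if none exists."""
    subst = dict(subst or {})
    if isinstance(pattern, str):  # pattern is a variable
        if pattern in subst and subst[pattern] != term:
            return None  # a non-linear pattern bound to two different terms
        subst[pattern] = term
        return subst
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def substitute(term, subst):
    """Apply a substitution; unbound variables are left as they are."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(substitute(a, subst) for a in term[1:])

def rewrite_steps(t, rules):
    """Return all t' reachable from t by one rewrite step, at any occurrence."""
    results = []
    for l, r in rules:
        s = match(l, t)
        if s is not None:  # t itself is a redex
            results.append(substitute(r, s))
    if not isinstance(t, str):
        for i in range(1, len(t)):  # rewrite inside the i-th argument
            for u in rewrite_steps(t[i], rules):
                results.append(t[:i] + (u,) + t[i + 1:])
    return results

# Hypothetical rule f(x, a) -> g(x) and term f(f(b, a), a)
rules = [(("f", "x", ("a",)), ("g", "x"))]
term = ("f", ("f", ("b",), ("a",)), ("a",))
successors = rewrite_steps(term, rules)
```

For the example term there are two redexes, one at the root and one at occurrence 1, so `successors` contains the encodings of g(f(b, a)) and f(g(b), a).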

Reachability, joinability, unifiability, unifier, most general unifier, confluence,
local confluence are defined in the usual way. For a TRS $R$, two terms $t_1$
and $t_2$ are $R$-unifiable if there exists a substitution $\sigma$ such that $t_1\sigma \leftrightarrow_R^* t_2\sigma$.



2.2 Tree Automata 

A tree automaton [5] is defined by a 4-tuple $M = (\mathcal{F}, Q, Q_{final}, \Delta)$ where $\mathcal{F}$ is
a signature, $Q$ is a finite set of states, $Q_{final} \subseteq Q$ is a set of final states, and
$\Delta$ is a finite set of transition rules of the form $f(q_1, \ldots, q_n) \to q$ where $f \in \mathcal{F}$,
$a(f) = n$, and $q_1, \ldots, q_n, q \in Q$ or of the form $q' \to q$ where $q, q' \in Q$. The latter
one is called an $\varepsilon$-rule. Consider the set of ground terms $\mathcal{G}(\mathcal{F} \cup Q)$ where $a(q) = 0$
for $q \in Q$. A move of a tree automaton can be regarded as a rewrite relation on
$\mathcal{G}(\mathcal{F} \cup Q)$ by regarding transition rules in $\Delta$ as rewrite rules on $\mathcal{G}(\mathcal{F} \cup Q)$. For
terms $t$ and $t'$ in $\mathcal{G}(\mathcal{F} \cup Q)$, we write $t \vdash_M t'$ if and only if $t \to_\Delta t'$. The reflexive
and transitive closure of $\vdash_M$ is denoted by $\vdash_M^*$. For a tree automaton $M$ and
$t \in \mathcal{G}(\mathcal{F})$, if $t \vdash_M^* q_f$ for a final state $q_f \in Q_{final}$, then we say $t$ is accepted by $M$.
The set of ground terms accepted by $M$ is denoted by $L(M)$. A set $T$ of ground
terms is recognizable if there is a tree automaton $M$ such that $T = L(M)$. Also
let $L(M(q)) = \{t \mid t \vdash_M^* q\}$ for a state $q$ of $M$. Recognizable sets inherit some
useful properties of regular (string) languages [5].

Lemma 1. The class of recognizable sets is effectively closed under union, in- 
tersection and complementation. For a recognizable set L, the following problems 
are decidable. (1) Does a given ground term t belong to L? (2) Is L empty? □ 
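
The membership problem (1) can be illustrated with a small bottom-up evaluation. This is only a sketch under assumed encodings that do not come from the paper: a ground term is a tuple headed by its function symbol, a transition rule $f(q_1, \ldots, q_n) \to q$ is the pair `(("f", (q1, ..., qn)), q)`, and an $\varepsilon$-rule $q' \to q$ is the pair `(q', q)`. The example automaton and its state names are hypothetical.

```python
def reachable_states(term, rules, eps_rules):
    """Return the set of states q such that term |-* q."""
    symbol, args = term[0], term[1:]
    # States reachable from each argument, computed bottom-up.
    child_sets = [reachable_states(a, rules, eps_rules) for a in args]
    states = set()
    for (f, qs), q in rules:
        if f == symbol and len(qs) == len(args) and all(
            qi in cs for qi, cs in zip(qs, child_sets)
        ):
            states.add(q)
    # Close the set under epsilon-rules q' -> q.
    changed = True
    while changed:
        changed = False
        for q1, q2 in eps_rules:
            if q1 in states and q2 not in states:
                states.add(q2)
                changed = True
    return states

def accepts(term, rules, eps_rules, final_states):
    """True if the term reaches some final state."""
    return bool(reachable_states(term, rules, eps_rules) & final_states)

# Hypothetical automaton over a constant a and a unary symbol s that accepts
# exactly the terms s^k(a) with k even.
rules = [
    (("a", ()), "q_even"),
    (("s", ("q_even",)), "q_odd"),
    (("s", ("q_odd",)), "q_even"),
]
eps_rules = [("q_even", "q_acc")]
final_states = {"q_acc"}

zero = ("a",)
one = ("s", zero)
two = ("s", one)
```

Here `accepts(two, rules, eps_rules, final_states)` holds while `accepts(one, rules, eps_rules, final_states)` does not. Computing the whole set of reachable states handles nondeterministic automata without backtracking.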

3 TRS which Preserves Recognizability 

3.1 Definition and Known Results 

For a TRS $R$ and a set $T$ of ground terms, define $R^*(T) = \{t \mid \exists s \in T \text{ s.t. } s \to_R^*
t\}$. A TRS $R$ is said to effectively preserve recognizability if, for any tree
automaton $M$, the set $R^*(L(M))$ is also recognizable and we can effectively construct a
tree automaton which accepts $R^*(L(M))$. In this paper, the class of TRSs which
effectively preserve recognizability is written as EPR-TRS.

Theorem 1. If a TRS $R$ belongs to EPR-TRS, then the reachability relation
and the joinability relation for $R$ are decidable [7]. It is also decidable whether $R$
is locally confluent or not [8]. □
Theorem 2. For a confluent $R \in$ EPR-TRS and linear terms $t_1$ and $t_2$ with
$Var(t_1) \cap Var(t_2) = \emptyset$, it is decidable whether $t_1$ and $t_2$ are $R$-unifiable or not.

Proof. Since $R$ is confluent, $t_1$ and $t_2$ are $R$-unifiable if and only if there exist
a ground substitution $\sigma$ and a term $v$ such that $t_1\sigma \to_R^* v$ and $t_2\sigma \to_R^* v$. For
a term $t$, let $I(t)$ denote the set of ground instances of $t$, i.e., $I(t) = \{t' \mid
\exists \text{ ground substitution } \sigma \text{ with } t' = t\sigma\}$. Then $t_1$ and $t_2$ are $R$-unifiable if and only
if

$R^*(I(t_1)) \cap R^*(I(t_2)) \neq \emptyset$ (3.1)

since $Var(t_1) \cap Var(t_2) = \emptyset$. It is easy to see that $I(t)$ is recognizable for any linear
term $t$. Thus $R^*(I(t_1))$ and $R^*(I(t_2))$ are recognizable since $R \in$ EPR-TRS. By
Lemma 1, the condition (3.1) is decidable. □

Unfortunately it is undecidable whether a given TRS belongs to EPR-TRS
or not [6]. Therefore decidable subclasses of EPR-TRS have been proposed, for
example, ground TRS by Brainerd [1], right-linear monadic TRS (RLM-TRS) by
Salomaa [13], linear semi-monadic TRS (LSM-TRS) by Coquidé et al. [2], and
linear generalized semi-monadic TRS (LGSM-TRS) by Gyenizse and Vágvölgyi [8].
Note that these papers assume the variable restriction.

Theorem 3. ground TRS $\subset$ RLM-TRS $\subset$ EPR-TRS, and ground TRS $\subset$ LSM-TRS
$\subset$ LGSM-TRS $\subset$ EPR-TRS. □

Similar discussions can be found in [10], [4] and [12]. A TRS $R$ (without the
variable restriction) is growing if all variables in $Var(l) \cap Var(r)$ occur at depth 0 or
1 in $l$ for every rewrite rule $l \to r$ in $R$ [10]. Nagaya and Toyama [12] showed that
for each left-linear growing TRS (LLG-TRS) $R$, $R^{-1}$ effectively preserves
recognizability. Note that if a TRS $R$ satisfies the variable restriction then $R$ is (linear,
right-linear) semi-monadic if and only if $R^{-1}$ is (linear, left-linear) growing and
the left-hand side of every rewrite rule in $R$ is not a constant. LLG-TRS$^{-1}$
properly includes both RLM-TRS and LSM-TRS and is incomparable with
LGSM-TRS.



3.2 Finite Path Overlapping TRS 

A new class of TRS named finite path overlapping TRS (FPO-TRS) is proposed 
in this section without assuming the variable restriction. As we will show in 3.3, 
the class of right-linear FPO-TRS properly includes the class of right-linear gen- 
eralized semi-monadic TRS and LLG-TRS~^. It is also shown in Section 4 that 
a right-linear FPO-TRS (without the variable restriction) effectively preserves 
recognizability. To the authors’ knowledge, the proposed class is the largest de- 
cidable subclass of EPR-TRS. To define the class, some additional definitions 
are necessary. We say that a term $s$ sticks out of $t$ if $t$ is not a variable and there
is a variable occurrence $\gamma$ ($\succ \lambda$) of $t$ such that

1. for any occurrence $o$ with $\lambda \preceq o \prec \gamma$, we have $o \in Occ(s)$ and the function
symbol of $s$ at $o$ and the function symbol of $t$ at $o$ are the same, and

2. $\gamma \in Occ(s)$ and $s/\gamma$ is not a ground term.

Fig. 1. The sticking out relations of rewrite rules.


When the occurrence $\gamma$ is of interest, we say that $s$ sticks out of $t$ at $\gamma$. If $s$
sticks out of $t$ at $\gamma$ and $s/\gamma$ is not a variable (i.e. $s/\gamma$ is a non-ground and
non-variable term), then $s$ is said to properly stick out of $t$ (at $\gamma$). For example, a
term $f(g(x), a)$ sticks out of $f(g(y), b)$ at the occurrence $1 \cdot 1$, and $f(g(g(x)), a)$
properly sticks out of $f(g(y), b)$ at the occurrence $1 \cdot 1$. A finite path overlapping
TRS (FPO-TRS) is a TRS $R$ such that the following sticking-out graph of $R$
does not have a cycle of weight one or more.

Definition 1. The sticking-out graph of a TRS $R$ is a directed graph $G = (V, E)$
where $V = R$ (i.e. the vertices are the rewrite rules in $R$) and $E$ is defined as
follows. Let $v_1$ and $v_2$ be (possibly identical) vertices which correspond to rewrite
rules $l_1 \to r_1$ and $l_2 \to r_2$, respectively. Replace each variable in $Var(r_i) \setminus Var(l_i)$
with a fresh constant, say $\Diamond$, for $i = 1, 2$.

(i) If $r_2$ properly sticks out of a subterm of $l_1$, then $E$ contains an edge from
$v_2$ to $v_1$ with weight one.

(ii) If a subterm of $r_2$ properly sticks out of $l_1$, then $E$ contains an edge from
$v_2$ to $v_1$ with weight one.

(iii) If a subterm of $l_1$ sticks out of $r_2$, then $E$ contains an edge from $v_2$ to $v_1$
with weight zero.

(iv) If $l_1$ sticks out of a subterm of $r_2$, then $E$ contains an edge from $v_2$ to $v_1$
with weight zero.

The four cases are illustrated in Fig. 1. □

Example 1. Let $R_1 = \{p_1: f(x, a) \to f(h(y), x),\ p_2: g(y) \to f(g(y), b)\}$. Fig. 2
shows the sticking-out graph of $R_1$. The right-hand side of $p_2$ properly sticks out
of the left-hand side of $p_1$ at the occurrence 1, and hence there is an edge of weight
one from $p_2$ to $p_1$. The sticking-out graph also has a self-looping edge of weight
zero at $p_2$ since the left-hand side $g(y)$ of $p_2$ sticks out of $f(g(y), b)/1 = g(y)$.
Since the variable $y$ in $p_1$ is replaced with a constant $\Diamond$, the right-hand side of
$p_1$ does not stick out of its left-hand side. There is no other edge since there
is no other sticking-out relation between subterms of these rewrite rules. The
sticking-out graph has a cycle of weight zero, but does not have a cycle of weight
one or more, and hence $R_1$ is finite path overlapping.

Fig. 2. The sticking-out graph of $R_1$.

Let $R_2 = \{f(x) \to g(f(g(x)))\}$. The subterm $f(g(x))$ of the right-hand
side of the (unique) rewrite rule properly sticks out of its left-hand side, as in
condition (ii) of the definition of sticking-out graph. The sticking-out graph of $R_2$
consists of one vertex and one cycle with weight one. Therefore, $R_2$ is not finite
path overlapping. Note that $R_2 \notin$ EPR-TRS since $R_2^*(\{f(a)\}) = \{g^n(f(g^n(a))) \mid
n \geq 0\}$ is not recognizable. □

Note that the sticking-out graph is effectively constructible for a given TRS
$R$, and hence it is decidable whether a given TRS $R$ is finite path overlapping
or not (in $O(m^2 n^2)$ time where $m$ is the maximum size of a term in $R$ and $n$ is
the number of rules in $R$).
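
Once the sticking-out graph is available, the finite path overlapping condition reduces to a reachability check. The following Python sketch assumes the graph is given as a collection of `(source, target, weight)` triples over rule indices; this representation and the function name are assumptions for illustration, and no attempt is made to match the time bound stated above.

```python
def is_finite_path_overlapping(num_rules, edges):
    """edges: iterable of (source, target, weight) with weight 0 or 1."""
    edges = list(edges)
    succ = {v: set() for v in range(num_rules)}
    for u, v, _ in edges:
        succ[u].add(v)

    def reachable(start):
        # vertices reachable from start by zero or more edges
        seen, stack = {start}, [start]
        while stack:
            for w in succ[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    # A cycle has weight >= 1 iff it uses some weight-one edge u -> v,
    # which is the case iff u is reachable from v.
    return not any(w == 1 and u in reachable(v) for u, v, w in edges)
```

With the edges of Example 1, `is_finite_path_overlapping(2, [(1, 0, 1), (1, 1, 0)])` holds for $R_1$, while the single weight-one self-loop of $R_2$ makes the check fail.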

3.3 Inclusion Relation 

Although a generalized semi-monadic TRS (GSM-TRS) was originally defined 
with the variable restriction in [8], we define GSM-TRS without the variable 
restriction to treat growing TRS, GSM-TRS and FPO-TRS in a uniform way. 

A TRS $R$ is generalized semi-monadic if the following condition holds for
any pair of (possibly the same) rewrite rules $l_1 \to r_1$ and $l_2 \to r_2$ in $R$. For
$i = 1, 2$, each variable in $Var(r_i) \setminus Var(l_i)$ is replaced with a fresh constant. For
any occurrences $\alpha \in Occ(l_1)$ and $\beta \in Occ(r_2)$ such that $\alpha = \lambda$ or $\beta = \lambda$ and for
any term $l_3$ which subsumes $l_1/\alpha$, if $r_2/\beta$ and $l_3$ are unifiable, then

1. $l_1/\alpha$ is a variable, or

2. for any $\gamma \in Occ(l_3)$ such that $l_1/\alpha \cdot \gamma$ is a variable, $(l_3/\gamma)\sigma$ is a variable or a
ground term where $\sigma$ is the most general unifier of $r_2/\beta$ and $l_3$.

Lemma 2. A TRS R is generalized semi-monadic if and only if the sticking-out 
graph of R has no edge with weight one. 

Proof. We show the only if part by contradiction. The if part can be shown in
a similar way. Assume that $R$ is a GSM-TRS and contains rules $l_1 \to r_1$ and
$l_2 \to r_2$ (each variable in $Var(r_i) \setminus Var(l_i)$ has been replaced with a constant $\Diamond$
for $i = 1, 2$) which satisfy condition (i) of the definition of sticking-out graph.
In this case, there is an occurrence $\alpha \in Occ(l_1)$ such that $r_2$ properly sticks out
of $l_1/\alpha$. Let $\gamma$ be the variable occurrence of $l_1/\alpha$ at which $r_2$ properly sticks out
of $l_1/\alpha$, then $l_1/\alpha \cdot \gamma$ is a variable and $r_2/\gamma$ is a non-ground and non-variable
term. Let $l_3$ be the term which satisfies the following conditions.

- For an occurrence $o$ with $\lambda \preceq o \prec \gamma$, $l_3$ and $l_1/\alpha$ have the same symbol at $o$,

- a variable, say $x_o$, occurs at an occurrence $o$ such that $o$ is disjoint to $\gamma$ and
$o$ is written as $o' \cdot i$ with $o' \prec \gamma$, and

- a variable $x_\gamma$ occurs at $\gamma$.

It is easily understood that $l_3$ subsumes $l_1/\alpha$ and that $l_3$ and $r_2$ are unifiable by
an mgu $\sigma$, which replaces $x_\gamma$ by $r_2/\gamma$. Now we have $(l_3/\gamma)\sigma = r_2/\gamma$, which is
neither a variable nor a ground term by assumption. This concludes that $R$ is
not a GSM-TRS. In a similar way, we can show that if any pair of rules in $R$
satisfies condition (ii) of the definition of sticking-out graph, then $R$ is not a
GSM-TRS. □



Theorem 4. The class of right-linear FPO-TRS properly includes the class of 
right-linear GSM-TRS. 

Proof. The class of right-linear FPO-TRS includes the class of right-linear GSM-TRS
by Lemma 2. TRS $R_1$ in Example 1 is right-linear FPO but not GSM. If
we take $l_1 = f(x, a)$, $r_2 = f(g(y), b)$, $\alpha = \beta = \lambda$ and $l_3 = f(x, z)$, then $r_2$ and $l_3$
are unifiable by an mgu $\sigma = \{x \mapsto g(y), z \mapsto b\}$. Let $\gamma = 1$, then $l_1/\alpha \cdot \gamma = l_1/1$
is a variable $x$ and $(l_1/\alpha \cdot \gamma)\sigma = g(y)$. Therefore $R_1$ is not a GSM-TRS. □

4 A Right-Linear Finite Path Overlapping TRS 
Effectively Preserves Recognizability 

4.1 Construction of Tree Automata 

In this subsection, we will present a procedure which takes a right-linear TRS
$R$ and a tree automaton $M$ as an input and constructs a tree automaton $M^*$
such that $L(M^*) = R^*(L(M))$ if the procedure halts. In the next subsection, it
is shown that if $R$ is a right-linear FPO-TRS, then the procedure always halts.
This concludes that right-linear FPO-TRS $\subseteq$ EPR-TRS. The procedure is an
extension of the procedure to solve a semantic unification problem presented in
[11]. In [11], rewrite rules are restricted so that variables appearing in the
left-hand side more than once do not occur in the right-hand side. The restriction
can be dropped in the following way. In the construction of $M^*$, a term is used
as a state of the tree automaton. To deal with non-left-linear TRS, we need
to construct a kind of product automata whose states are Cartesian products
of terms. To represent such Cartesian products and usual first-order terms in
a uniform way, we introduce a packed state. Intuitively, a packed state is an
extension of a first-order term such that a finite set of terms, rather than a
single term, occurs at a subterm occurrence. For a signature $\mathcal{F}$ and a finite set
$Q$, the set of packed states, denoted $\mathcal{P}_{\mathcal{F},Q}$, is defined as follows.

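- $\{q\} \in \mathcal{P}_{\mathcal{F},Q}$ for any $q \in Q$.

- If $p_1, p_2 \in \mathcal{P}_{\mathcal{F},Q}$, then $p_1 \cup p_2 \in \mathcal{P}_{\mathcal{F},Q}$.

- If $f \in \mathcal{F}$ and $p_1, \ldots, p_{a(f)} \in \mathcal{P}_{\mathcal{F},Q}$, then $\{f(p_1, \ldots, p_{a(f)})\} \in \mathcal{P}_{\mathcal{F},Q}$.

For readability, a packed state is written with angle brackets, as in $\langle t_1, \ldots, t_n \rangle$. For
example, let $\mathcal{F} = \{f, g\}$ with $a(f) = 2$ and $a(g) = 1$ and $Q = \{q_1, q_2\}$. We can
easily verify that $\langle f(\langle q_1 \rangle, \langle q_1 \rangle), g(\langle g(\langle q_1 \rangle), \langle q_2 \rangle \rangle) \rangle$ belongs to $\mathcal{P}_{\mathcal{F},Q}$.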
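
One possible way to represent packed states concretely is as nested frozensets, so that the union in the second clause is ordinary set union. The sketch below is an assumption-laden illustration (the helper names `q`, `app` and `is_packed_state` are hypothetical, and states of $Q$ are assumed to be strings), not a representation prescribed by the paper.

```python
def q(name):
    """The packed state {q} for a state name q in Q (first clause)."""
    return frozenset({name})


def app(symbol, *args):
    """The packed state {f(p1, ..., pn)}; each argument is a packed state."""
    return frozenset({(symbol,) + args})


def is_packed_state(p, arity, states):
    """Check membership in P_{F,Q} by following the three defining clauses."""
    if not isinstance(p, frozenset) or not p:
        return False
    for elem in p:
        if isinstance(elem, str):
            # an element q of Q
            if elem not in states:
                return False
        elif isinstance(elem, tuple) and elem:
            # an element f(p1, ..., pn) whose arguments are packed states
            f, args = elem[0], elem[1:]
            if arity.get(f) != len(args):
                return False
            if not all(is_packed_state(a, arity, states) for a in args):
                return False
        else:
            return False
    return True


ARITY = {"f": 2, "g": 1}
Q = {"q1", "q2"}

# f(<q1>, <q1>) packed together with g applied to the union of <g(<q1>)> and <q2>
EXAMPLE = app("f", q("q1"), q("q1")) | app("g", app("g", q("q1")) | q("q2"))
```

Because frozensets are hashable and compared by content, two packed states built in different orders are equal exactly when they denote the same set, which is convenient when packed states are later used as automaton states.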

Procedure 1. 

Input: a right-linear TRS $R$ and a tree automaton $M = (\mathcal{F}, Q, Q_{final}, \Delta)$.
Output: a tree automaton $M^*$ such that $L(M^*) = R^*(L(M))$.

Procedure 1 has a loop structure to expand the set $Q$ of states and the set $\Delta$ of
transition rules of $M$. For a nonnegative integer $k$ let $M_k$ be the tree automaton
which is obtained from $M$ by executing the loop $k$ times. We write $\vdash_k$ for $\vdash_{M_k}$.
Observe that if $t \in R^*(L(M))$ and $t \to_R t'$ then $t' \in R^*(L(M))$ by the definition
of $R^*(\cdot)$. Hence, if $M_k$ accepts $t$, then we construct $M_{k+1}$ so that it accepts $t'$
by adding transition rules which simulate a rewrite step $t \to_R t'$. In general, if
$t \to_R t'$ by applying a rule $l \to r$ and $t \vdash_k^* p$, where $p$ is a state of $M_k$, then we
add transition rules which make $t' \vdash_{k+1}^* p$ possible.

Step 1. Add a new state $q_{any}$ to $Q$ and add a transition rule $f(q_{any}, \ldots, q_{any}) \to
q_{any}$ to $\Delta$ for each $f$ in $\mathcal{F}$. Obviously, $t \vdash_M^* q_{any}$ for any $t \in \mathcal{T}(\mathcal{F})$. Let
$M_0 = (\mathcal{F}, Q_0, Q^0_{final}, \Delta_0)$ be a "packed" version of $M$ where $Q_0 = \{\langle q \rangle \mid
q \in Q\} \subseteq \mathcal{P}_{\mathcal{F},Q}$, $Q^0_{final} = \{\langle q \rangle \mid q \in Q_{final}\}$, and $\Delta_0 = \{f(\langle q_1 \rangle, \ldots, \langle q_n \rangle) \to
\langle q \rangle \mid f(q_1, \ldots, q_n) \to q \in \Delta\} \cup \{\langle q' \rangle \to \langle q \rangle \mid q' \to q \in \Delta\}$.

Step 2. Let $k := 0$. This $k$ is used as a loop counter of the procedure.

Step 3. Let $Q_{k+1} := Q_k$ and $\Delta_{k+1} := \Delta_k$.

Step 4. The set of transition rules is modified in this step. Let $l \to r$ be a
rewrite rule in $R$. It is assumed that $l$ has $m$ ($\geq 0$) variables $x_1, \ldots, x_m$ and
the variable $x_i$ has $\gamma_i$ occurrences $o_i^j \in Occ(l)$ ($1 \leq j \leq \gamma_i$) in $l$. If there are
states $p$ and $p_i^j$ ($1 \leq i \leq m$, $1 \leq j \leq \gamma_i$) in $Q_k$ such that

$$l[o_i^j \leftarrow p_i^j \mid 1 \leq i \leq m,\ 1 \leq j \leq \gamma_i] \vdash_k^* p \qquad (4.1)$$

and

$$L(M_k(p_i^1)) \cap \cdots \cap L(M_k(p_i^{\gamma_i})) \neq \emptyset \qquad (4.2)$$

for $1 \leq i \leq m$, then add

$$P_i = \bigcup_{1 \leq j \leq \gamma_i} p_i^j \quad (1 \leq i \leq m) \qquad (4.3)$$

to $Q_{k+1}$ as new states, let $\rho = \{x_i \mapsto P_i \mid 1 \leq i \leq m\} \cup \{x \mapsto \langle q_{any} \rangle \mid x \in
Var(r) \setminus Var(l)\}$, and do the following (a) and (b).

(a) Add $\langle r\rho \rangle \to p$ to $\Delta_{k+1}$. If a move of the tree automaton is caused by
this rule, then the move is called a rewriting move of degree $k + 1$.

(b) Execute ADDTRANS($\langle r\rho \rangle$). In ADDTRANS($\langle r\rho \rangle$), new states
and transition rules are defined so that $r\rho \vdash_{k+1}^* \langle r\rho \rangle$.

Simultaneously execute this Step 4 for every rewrite rule and every tuple of
states that satisfy conditions (4.1) and (4.2).

Step 5. Continue the loop until $\Delta_{k+1} = \Delta_k$. If $\Delta_{k+1} \neq \Delta_k$, then $k := k + 1$
and go to Step 3.

Step 6. Output $M_k$ as $M^*$.

□

Fig. 3. The new rules introduced by ADDTRANS.
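
Condition (4.2) asks whether several states accept a common ground term. One standard way to decide this, sketched below in Python, is a subset construction that collects every set of states reachable by some ground term; the states have a common term exactly when one of these sets contains all of them. The rule encoding and function names are assumptions for this sketch, and the paper does not say how (4.2) is checked in practice.

```python
from itertools import product


def reachable_state_sets(arity, rules, eps=()):
    """All sets {q | t |-* q} obtained from some ground term t.

    rules: dict mapping (f, q1, ..., qn) to a set of target states;
    eps:   list of epsilon-rules (q, q2), meaning q -> q2.
    """
    def close(states):
        states = set(states)
        changed = True
        while changed:
            changed = False
            for a, b in eps:
                if a in states and b not in states:
                    states.add(b)
                    changed = True
        return frozenset(states)

    found = set()
    while True:
        new = set()
        for f, n in arity.items():
            for args in product(found, repeat=n):
                targets = set()
                for qs in product(*args):
                    targets |= rules.get((f,) + qs, set())
                if targets:
                    new.add(close(targets))
        if new <= found:
            return found
        found |= new


def intersection_nonempty(states, arity, rules, eps=()):
    """True iff some ground term reaches every state in `states`."""
    want = set(states)
    return any(want <= s for s in reachable_state_sets(arity, rules, eps))


# The automaton M of Example 2, written in the assumed encoding.
ARITY = {"c": 0, "h": 1, "g": 1, "f": 2}
RULES = {
    ("c",): {"q0"},
    ("h", "q0"): {"q1"},
    ("h", "q1"): {"q0"},
    ("f", "q0", "q0"): {"q2"},
    ("f", "q1", "q0"): {"q2"},
}
```

For the rule $f(x, x) \to g(x)$ and the automaton of Example 2, the pair of states $(q_0, q_0)$ passes this test, whereas $(q_1, q_0)$ does not, since no term reaches both $q_1$ and $q_0$.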



Procedure 2. [ADDTRANS] This procedure takes a packed state $p$ as an
input. If $p$ has already been defined as a state, then the procedure defines no
transitions. Otherwise, the procedure first defines $p$ as a new state of $Q_{k+1}$ and
also defines transition rules as follows. It is required that if $p = \langle t_1, \ldots, t_n \rangle$
($n \geq 2$), then each $\langle t_i \rangle$ has been defined as a state.

Case 1. If $p = \langle c \rangle$ with $c$ a constant, then define $c \to \langle c \rangle$ as a transition rule.

Case 2. If $p = \langle f(p_1, \ldots, p_{a(f)}) \rangle$ with $f \in \mathcal{F}$, then define $f(p_1, \ldots, p_{a(f)}) \to p$
as a transition rule and execute ADDTRANS($p_i$) for $1 \leq i \leq a(f)$.

Case 3. If $p = \langle t_1, \ldots, t_n \rangle$ ($n \geq 2$), then do the following (i) and (ii).

(i) For each transition rule of the form $p' \to \langle t_i \rangle$ ($p' \in Q_k$, $1 \leq i \leq n$), define
a new $\varepsilon$-rule $p'' \to p$ and execute ADDTRANS($p''$) where $p''$ is the state
defined as $p'' = (p \setminus \langle t_i \rangle) \cup p'$ (see Fig. 3(a)).

(ii) If there is a function symbol $f$ and, for each $i$ with $1 \leq i \leq n$, there
are states $p_{ij}$ with $1 \leq j \leq a(f)$ such that $f(p_{i1}, \ldots, p_{ia(f)}) \to \langle t_i \rangle$ is a
transition rule, then define a new rule $f(p_1, \ldots, p_{a(f)}) \to p$ and execute
ADDTRANS($p_j$) for $1 \leq j \leq a(f)$ where $p_j = p_{1j} \cup \cdots \cup p_{nj}$ (see Fig. 3(b)). □



Example 2. Let $M = (\mathcal{F}, Q, Q_{final}, \Delta)$ be a tree automaton where $\mathcal{F} = \{f, g, h,
c\}$ with $a(f) = 2$, $a(g) = a(h) = 1$ and $a(c) = 0$, $Q = \{q_0, q_1, q_2\}$, $Q_{final} = \{q_2\}$
and $\Delta$ consists of the following transition rules:

$c \to q_0$, $h(q_0) \to q_1$, $h(q_1) \to q_0$,

$f(q_0, q_0) \to q_2$, $f(q_1, q_0) \to q_2$.

It can be easily verified that $L(M) = \{f(h^m(c), h^{2n}(c)) \mid m, n \geq 0\}$. Let $R =
\{f(x, x) \to g(x),\ g(x) \to x\}$. $R$ is a right-linear FPO-TRS. We apply Procedure 1
to $M$ and $R$. Consider the rewrite rule $f(x, x) \to g(x)$ in Step 4 for $M_0$ ($k = 0$).
Since a move $f(\langle q_0 \rangle, \langle q_0 \rangle) \vdash_0 \langle q_2 \rangle$ is possible, new transition rules

$\langle g(\langle q_0 \rangle) \rangle \to \langle q_2 \rangle$ and $g(\langle q_0 \rangle) \to \langle g(\langle q_0 \rangle) \rangle$

are added to $\Delta_1$. The latter rule is added in Case 2 of ADDTRANS($\langle g(\langle q_0 \rangle) \rangle$).
Next, consider the rewrite rule $g(x) \to x$ in Step 4 for $M_1$ ($k = 1$). Since
$g(\langle q_0 \rangle) \vdash_1 \langle g(\langle q_0 \rangle) \rangle \vdash_1 \langle q_2 \rangle$, $\langle q_0 \rangle \to \langle q_2 \rangle$ is added to $\Delta_2$. Thus we obtain
$h(h(c)) \vdash_2^* \langle q_0 \rangle \vdash_2 \langle q_2 \rangle \in Q_{final}$, hence $h(h(c)) \in L(M_2)$. We can verify that
$M_3 = M_2$ ($= M^*$) and $L(M^*) = R^*(L(M)) = \{g(h^{2n}(c)) \mid n \geq 0\} \cup \{h^{2n}(c) \mid
n \geq 0\} \cup L(M)$. □
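
The outcome of Example 2 can be checked mechanically by running the automata bottom-up. The Python sketch below hard-codes $M$ and the three transition rules that the example adds to obtain $M^*$; it does not implement Procedure 1 itself. The encoding of rules, the state name `"g<q0>"` for $\langle g(\langle q_0 \rangle) \rangle$ and the helper names are assumptions for illustration.

```python
def run(term, rules, eps=()):
    """Return the set of states q with term |-* q (bottom-up evaluation)."""
    f, args = term[0], term[1:]
    reach = [run(a, rules, eps) for a in args]
    states = {q for (g, *qs), q in rules
              if g == f and len(qs) == len(reach)
              and all(s in r for s, r in zip(qs, reach))}
    changed = True
    while changed:  # close under epsilon-rules
        changed = False
        for a, b in eps:
            if a in states and b not in states:
                states.add(b)
                changed = True
    return states


def h(n):
    """Build the term h^n(c)."""
    t = ("c",)
    for _ in range(n):
        t = ("h", t)
    return t


# Transition rules of M, as (left-hand side, target state).
M = [(("c",), "q0"), (("h", "q0"), "q1"), (("h", "q1"), "q0"),
     (("f", "q0", "q0"), "q2"), (("f", "q1", "q0"), "q2")]

# M* of Example 2: one new non-epsilon rule and two new epsilon-rules.
M_STAR = M + [(("g", "q0"), "g<q0>")]
EPS_STAR = [("g<q0>", "q2"), ("q0", "q2")]


def accepts(term, rules, eps=()):
    return "q2" in run(term, rules, eps)
```

For example, `accepts(("g", h(2)), M_STAR, EPS_STAR)` holds while `accepts(("g", h(1)), M_STAR, EPS_STAR)` does not, in line with the description of $R^*(L(M))$ above.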

The following lemma states a basic property of a packed state, which is used for 
the proof of Lemma 4 . 

Lemma 3. Let $M_k = (\mathcal{F}, Q_k, Q^k_{final}, \Delta_k)$ be the tree automaton constructed in
Procedure 1 ($k \geq 1$). Then, for each state $\langle t_1, \ldots, t_n \rangle \in Q_k$, $L(M_k(\langle t_1, \ldots, t_n \rangle))
= L(M_k(\langle t_1 \rangle)) \cap \cdots \cap L(M_k(\langle t_n \rangle))$. □

From the construction of the tree automata in Procedure 1, the inclusion
hierarchy $L(M) = L(M_0) \subseteq L(M_1) \subseteq \cdots$ holds. Procedure 1 has the following
soundness and completeness property. See [14] for the detailed proof.

Lemma 4. For a ground term $s$, $s \in R^*(L(M))$ if and only if there is an integer
$k$ such that $s \in L(M_k)$.

Proof. (If part) The following claim can be shown.

Claim A. For a ground term $s$, and states $p, p' \in Q_k$, if there is a
sequence of moves $s \vdash_k^* p' \vdash_k p$ such that

(i) $p' \vdash_k p$ is a rewriting move at the root occurrence, and

(ii) there is no rewriting move in $s \vdash_k^* p'$,

then there is a term $s'$ such that $s' \to_R s$ and $s' \vdash_{k-1}^* p$.

By Claim A, we can show by induction on $k$ that for a ground term $s$ and a state
$p$, if $s \vdash_k^* p$, then there is a ground term $u$ such that $u \to_R^* s$ and $u \vdash_0^* p$. If the
state $p$ is in $Q_{final}$, then this implies that if $s \in L(M_k)$ then $s \in R^*(L(M))$.

(Only if part) It can be shown that if $s' \to_R^* s$ with $s' \in L(M_0)$, then there is
an integer $k$ such that $s \in L(M_k)$ by induction on the length of the derivation
$s' \to_R^* s$. □

The following theorem is obtained directly from Lemma 4 . 

Theorem 5. For a right-linear TRS $R$, if Procedure 1 halts then $L(M^*) =
R^*(L(M))$. □



4.2 Termination of Procedure 1 

We show that if a right-linear FPO-TRS is given to Procedure 1 , then there is 
an upper-bound limit on the number of states which are newly defined. Once 
the set of states saturates, then the set of transition rules also saturates and 
the procedure halts. First, as a measure of the size of a state, we introduce the 
concept of the layer of a packed state. Intuitively, the number of layers of a 
packed state is the number of right-hand sides of rewrite rules which are used
for defining the state. For a packed state $p \in Q_k$, define the number of layers
of $p$, denoted layer($p$), as follows: (1) if $p \in Q_0$ or $p = \langle t \rangle$ with $t$ a ground
subterm of a rewrite rule in $R$, then layer($p$) $= 0$, (2) if $p = p_1 \cup p_2$, then
layer($p$) $= \max\{$layer($p_1$), layer($p_2$)$\}$, and (3) if $p = \langle (r/o)\sigma \rangle$ with $l \to r \in R$,
$o \in Occ(r)$, $Var(r/o) = \{x_1, \ldots, x_n\}$ and $\sigma = \{x_i \mapsto p_i \mid 1 \leq i \leq n\}$, then
layer($p$) $= 1 + \max\{$layer($p_i$) $\mid 1 \leq i \leq n\}$. Note that layer($p$) is not defined
for all packed states, but all packed states in Procedure 1 are of the form (1),
(2) or (3). Also note that layer($p$) is not always uniquely determined by this
definition. If different values are defined as layer($p$), then we choose the minimum
among the values as layer($p$). We note that if $x_i \in Var(r) \setminus Var(l)$, then $p_i =
\langle q_{any} \rangle$ and layer($p_i$) $= 0$. This means that variables which occur in the
right-hand side only are ignored for defining the number of layers.

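Example 3. Consider the states of the tree automata in Example 2. Let $l \to r =
f(x, x) \to g(x) \in R$, $o = \lambda$ and $\sigma = \{x \mapsto \langle q_0 \rangle\}$ in the above definition (3).
Then, $p = \langle (r/o)\sigma \rangle = \langle g(\langle q_0 \rangle) \rangle$ and layer($p$) $=$ layer($\langle q_0 \rangle$) $+ 1 = 1$. □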

Lemma 5. For any non-negative integer j, the number of packed states which 
have j or less layers is finite. 

Sketch of proof. The lemma can be shown by induction on j. □ 

In the following, it is shown that if $R$ is a right-linear FPO-TRS, then
layer($p$) $\leq |R|$ for any state $p$ defined by Procedure 1 where $|R|$ is the number
of rewrite rules in $R$. An outline of the proof is as follows. First we associate
each rule in $R$ with a non-negative integer called a rank. If $R$ is finite path
overlapping, then the rank is well-defined and is less than $|R|$. Next, it is shown that
if a rule with rank $j$ is used in Step 4 of Procedure 1, then layer($p$) $\leq j + 1$ for any
state $p$ defined in the same step. The rank of a rule in $R$ is defined based on the
sticking-out graph $G = (V, E)$ of $R$. Let $v$ be the vertex of $G$ which corresponds
to a rewrite rule $l \to r$ in $R$. The rank of $l \to r$ is the maximum weight of a
path to $v$ from any vertex in $V$. If $R$ is finite path overlapping, then the rank
of any rewrite rule is a non-negative integer less than $|R|$. For $R_1$ in Example 1,
the ranks of $p_1$ and $p_2$ are one and zero, respectively, since there is an edge with
weight one from $p_2$ to $p_1$.
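
The rank can be computed by a longest-path relaxation over the sticking-out graph, in the style of the Bellman-Ford algorithm. The sketch below assumes the graph is given as `(source, target, weight)` triples over rule indices (a representation chosen for this illustration only); it also detects the case in which ranks are unbounded, that is, when the TRS is not finite path overlapping.

```python
def ranks(num_rules, edges):
    """rank[v] = maximum weight of a path ending in vertex v (0 if none).

    Raises ValueError when a cycle of weight one or more exists, in which
    case path weights are unbounded and no rank is defined.
    """
    edges = list(edges)
    rank = [0] * num_rules
    # Without a positive-weight cycle, a heaviest path can be taken to be
    # simple, so the values stabilise within num_rules rounds.
    for _ in range(num_rules):
        changed = False
        for u, v, w in edges:
            if rank[u] + w > rank[v]:
                rank[v] = rank[u] + w
                changed = True
        if not changed:
            return rank
    raise ValueError("cycle of weight one or more: ranks are unbounded")
```

With the edges of $R_1$ (index 0 for $p_1$, index 1 for $p_2$), `ranks(2, [(1, 0, 1), (1, 1, 0)])` yields rank one for $p_1$ and rank zero for $p_2$, as stated above.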

Lemma 6. Let $l \to r$ be a rewrite rule and $\rho = \{x_i \mapsto P_i \mid 1 \leq i \leq m\} \cup \{x \mapsto
\langle q_{any} \rangle \mid x \in Var(r) \setminus Var(l)\}$ be a substitution which are used in Step 4 of
Procedure 1. If the rank of $l \to r$ is $j$, then layer($P_i$) $\leq j$ for each $1 \leq i \leq m$. □

Before presenting a proof of the lemma, we first see how the number of layers 
of the state changes by a move of the tree automaton. A move of the tree 
automaton is either an $\varepsilon$-move or a non-$\varepsilon$-move. Transition rules used for $\varepsilon$-moves
are $\varepsilon$-transition rules of the original tree automaton $M$, or rules for rewriting
moves defined in Step 4(a) of Procedure 1, or the rules defined in Case 3(i) of
the ADDTRANS procedure. In all three cases, it can be shown that the number of
layers in a state which is associated with the head does not increase. Transition
rules used for non-$\varepsilon$-moves are non-$\varepsilon$-transition rules of $M$, or the rules defined
in Cases 1, 2 or 3(ii) of ADDTRANS. In all cases, the number of layers in a
state is increased by one or not changed by the move.

Fig. 4. The number of layers of a state of $M_k$ in the sequence (4.1).

Proof of Lemma 6. The proof is by induction on the loop variable $k$ of
Procedure 1. When $k = 0$, every state belongs to $Q_0$ and layer($P_i$) $= 0$ for $1 \leq i \leq m$,
and the lemma holds for any $j$. Assume that the lemma holds for $k \leq n - 1$,
and consider the case with $k = n$. The inductive part is shown by contradiction.
Without loss of generality, let $P_1$ be a state such that layer($P_1$) $\geq j + 1$. Since
$P_1 = \bigcup_{1 \leq i \leq \gamma_1} p_1^i$, we can assume $p_1^1$ is the state such that layer($p_1^1$) $\geq j + 1$
without loss of generality. Let us observe how the number of layers of the state
changes as the head of $M_k$ moves from $o_1^1$ to the root in the sequence (4.1) of
moves. There are four different cases.

1. The number of layers decreases at a certain occurrence. Let $o$ be the
innermost occurrence among such occurrences. There are two different subcases:

(a) The number of layers does not increase at any $o'$ with $o \prec o' \prec o_1^1$.

(b) There is an occurrence $o'$ with $o \prec o' \prec o_1^1$ such that the number of
layers increases at $o'$.

2. The number of layers does not decrease at all. There are two subcases:

(a) The number of layers does not increase at any $o'$ with $\lambda \preceq o' \prec o_1^1$.

(b) There is an occurrence $o'$ with $\lambda \preceq o' \prec o_1^1$ such that the number of
layers increases at $o'$.

These four cases are illustrated in Fig. 4.

Assume that the number of layers changes as in case 1(a) above. In this
case we can derive a contradiction as follows (see [15] for the precise and more
formal proof). From the observation before this proof, we know that the number
of layers decreases at $o$ only if an $\varepsilon$-move occurs at $o$. We can furthermore show
that this $\varepsilon$-move is caused by a transition rule for rewriting moves. Let $l' \to r'$ be
the rewrite rule used for defining this transition rule in Step 4 of Procedure 1.
Then, the state just before the $\varepsilon$-move occurs at $o$ can be written as $\langle r'\rho' \rangle$.
Note that layer($\langle r'\rho' \rangle$) $=$ layer($p_1^1$) $\geq j + 1$ since the number of layers changes
as in case 1(a). This implies that the substitution $\rho'$ replaces a variable in $r'$ with
a state which has $j$ or more layers. Therefore, by using the inductive hypothesis,
the rule $l' \to r'$ must have rank $j$ or more. On the other hand, the fact that
the number of layers does not increase at $o'$ with $o \prec o' \prec o_1^1$ implies that $r'$
properly sticks out of $l/o$ as in condition (i) of the definition of sticking-out graph
(Definition 1). Hence, the rank of $l \to r$ must be larger than the rank of $l' \to r'$,
and consequently must be $j + 1$ or more, a contradiction.

For case 2(a), it can be shown that there is a rewrite rule $l' \to r'$ with
rank $j$ or more, and a subterm of $r'$ sticks out of $l$ as in condition (ii) of the
definition of the sticking-out graph. For cases 1(b) and 2(b), it can be shown
that there is a rewrite rule $l' \to r'$ with rank $j + 1$ or more, and the rule satisfies
conditions (iii) and (iv) of the definition of sticking-out graph, respectively. In
either case, a contradiction is derived. Hence, the inductive part is shown and
the proof is complete. □

For a right-linear FPO-TRS $R$, the rank of every rule is less than $|R|$ and
hence the number of layers of any packed state is $|R|$ or less by Lemma 6. By
Lemma 5, the number of packed states is finite and the following theorem holds.



Theorem 6. Procedure 1 halts for a right-linear FPO-TRS. □ 

In general, the running time of Procedure 1 is exponential in both the size of
a TRS $R$ and the size of a tree automaton $M$.

Corollary 1. LLG-TRS$^{-1}$ $\subset$ RLGSM-TRS $\subset$ RLFPO-TRS $\subset$ EPR-TRS,
where RLGSM-TRS and RLFPO-TRS are the classes of the right-linear GSM-TRS
and the right-linear FPO-TRS, respectively.

Proof. LLG-TRS$^{-1}$ $\subset$ RLGSM-TRS can easily be shown by definition. RLGSM-TRS
$\subset$ RLFPO-TRS is by Theorem 4. RLFPO-TRS $\subset$ EPR-TRS is by Theorems 5
and 6. □

5 Decidable Approximations 

In this section, we investigate decidable approximations of TRS along the lines
of [4, 10, 12]. A TRS $R'$ is an approximation of a TRS $R$ if $\to_R^* \subseteq \to_{R'}^*$ and
$NF_R = NF_{R'}$. An approximation mapping $\alpha$ is a mapping from TRSs to TRSs
such that $\alpha(R)$ is an approximation of $R$ for any TRS $R$. For a class $C$ of TRSs,
a $C$ approximation mapping is an approximation mapping such that $\alpha(R) \in C$
for every TRS $R$.

Jacquemard [10] introduced a linear growing approximation mapping. Later
Nagaya and Toyama [12] introduced a better approximation called a left-linear
growing approximation mapping and presented decidability results on them. An
RLFPO-TRS$^{-1}$ approximation mapping $\alpha$ is such that for a TRS $R$, $\alpha$ replaces
some variables in the right-hand side $r_2$ of a rewrite rule $l_2 \to r_2$ in $R^{-1}$ with
a new variable which is not in $Var(l_2)$, so that $r_2$ cannot contribute to an edge
in the sticking-out graph of $\alpha(R^{-1})$. For example, replacing variable $x$ with $x'$
in the right-hand side of the rule in $R_2$ of Example 1 yields an RLFPO-TRS$^{-1}$
approximation of $R_2^{-1}$. The following results are a generalization of [12].

Let $\alpha$ be an approximation mapping and $\Omega$ be a fresh constant. A redex
at an occurrence $p$ in $t \in \mathcal{T}(\mathcal{F})$ is $\alpha$-needed if there exists no $s \in NF_R$ such
that $t[p \leftarrow \Omega] \to_{\alpha(R)}^* s$ and $s$ contains no $\Omega$. This definition is due to [4]. If
$R$ is orthogonal, then every $\alpha$-needed redex is a needed redex in the sense of
Huet and Lévy [9]. Let CBN-NF$_\alpha$ $= \{R \mid$ every term $t \notin NF_R$ has an $\alpha$-needed
redex$\}$. By Theorems 15 and 29 in [4] and Lemma 1 of this paper, the following
theorem holds.

Theorem 7. Let $R$ be a left-linear TRS and $\alpha$ be an EPR-TRS$^{-1}$ approximation
mapping. Then the following problems are decidable. (1) Is a given redex in a
given term $\alpha$-needed? (2) Is $R$ in CBN-NF$_\alpha$? □

Corollary 2. Let $R$ be an orthogonal TRS in EPR-TRS$^{-1}$ which satisfies the
variable restriction such that $l$ is not a variable and $Var(r) \subseteq Var(l)$ for every
$l \to r \in R$.

(1) Every term $t \notin NF_R$ has a needed redex.

(2) It is decidable whether a given redex in a given term is needed. □

To conclude this section, we provide an orthogonal TRS $R$ in FPO-TRS$^{-1}$ such
that there exists no left-linear growing approximation mapping $\beta$ which satisfies
$R \in$ CBN-NF$_\beta$.

Example 4. Let $R = \{g(h(x)) \to f(x, x, x)\} \cup R'$ be an orthogonal TRS where
$R'$ consists of the following five rewrite rules:

$f(a, b, x) \to a$, $f(b, x, a) \to a$, $f(x, a, b) \to a$,

$f(a, a, a) \to a$, $f(b, b, b) \to b$.

It can be easily verified that $R$ is in FPO-TRS$^{-1}$. Every term $t \notin NF_R$ has
a needed redex in $R$ by Corollary 1 and Corollary 2 (1). On the other hand,
a left-linear growing approximation mapping $\beta$ should be $\beta(R) = \{g(h(y)) \to
f(x, x, x)\} \cup R'$ for some variable $y \neq x$. Consider a term $t = f(g(h(a)), g(h(a)),
g(h(a)))$. Obviously, $g(h(a)) \to_{\beta(R)}^* a$ and $g(h(a)) \to_{\beta(R)}^* b$. Hence, $t$ has no
$\beta$-needed redex. Thus, $R \notin$ CBN-NF$_\beta$. □



6 Conclusion 

A new class of TRS named finite path overlapping TRS (RLFPO-TRS) is pro- 
posed. It is shown that an RLFPO-TRS effectively preserves recognizability, 
and that the class properly includes known decidable classes of TRSs which 
effectively preserve recognizability. Approximations by the proposed class are 
also discussed. RLFPO-TRS does not include simple EPR-TRSs such as $R =
\{f(x) \to f(f(x))\}$. To construct a tree automaton $M^*$ which accepts $R^*(L(M))$
for a given tree automaton $M$, we might need an operation which "merges"
equivalent states of a tree automaton.

1 Introduction

The dependency pair method refers to the approach for proving (innermost)
termination of term rewriting systems by showing that no infinite chain of so-called
dependency pairs exists (for an overview of the method see [AG00]). The method
generates inequalities that should be satisfied by a suitable well-founded ordering.
Well-known techniques for searching simplification orderings (such as path
orderings or polynomial interpretations) may be used to find such an ordering.
The key point is that, even if the TRS is not simply terminating, the dependency
pair method often generates a set of inequalities that can be satisfied by
a simplification ordering and thereby proves termination of the TRS.
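
For readers unfamiliar with the notion, the following Python sketch computes dependency pairs according to the standard definition from the literature: for every rule $l \to r$ and every subterm $u$ of $r$ whose root is a defined symbol, the pair consists of $l$ and $u$ with their root symbols marked. The term encoding, the `#` suffix used for marking and the function names are assumptions for this sketch and are not taken from the tool described in this paper.

```python
def is_var(t):
    return isinstance(t, str)


def subterms(t):
    yield t
    if not is_var(t):
        for a in t[1:]:
            yield from subterms(a)


def mark(t):
    """Mark the root symbol of a non-variable term, e.g. add -> add#."""
    return (t[0] + "#",) + t[1:]


def dependency_pairs(rules):
    """Variables are strings; f(t1, ..., tn) is the tuple ("f", t1, ..., tn)."""
    defined = {l[0] for l, _ in rules}  # root symbols of left-hand sides
    return [(mark(l), mark(u))
            for l, r in rules
            for u in subterms(r)
            if not is_var(u) and u[0] in defined]


# Addition on unary numbers: add(0, y) -> y, add(s(x), y) -> s(add(x, y)).
ADD_RULES = [
    (("add", ("0",), "y"), "y"),
    (("add", ("s", "x"), "y"), ("s", ("add", "x", "y"))),
]
```

For `ADD_RULES` the only dependency pair relates `add#(s(x), y)` to `add#(x, y)`; termination then amounts to showing that this pair cannot be chained infinitely often.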

This paper describes a tool that implements the dependency pair approach 
with its most recent additions and refinements, such as modularity results that 
can effectively be used on larger TRSs [AG98] and operations on the dependency 
pairs such as narrowing, rewriting and instantiation of pairs. These refinements 
all increase the power of the method and turned out to be useful for larger
examples from a verification case study [GA00].

The tool is described from the user’s point of view via the interface (Sect. 2) 
and from an implementor’s point of view via the strategies that are used to deal 
with the occurring complexity problems (Sect. 3). In order to demonstrate the 
concept, some standard solutions for finding an ordering satisfying the generated 
inequalities (e.g. lexicographic path ordering [DF85, BN98]) are implemented as 
well, but the open architecture of the tool encourages solving these inequalities
with separate programs.

2 Interface 

From the perspective of the user, the tool consists of three different layers. First,
the user has the possibility to load a certain TRS, and to indicate whether to
prove termination or innermost termination (Fig. 1). Another alternative is to
prove termination by proving innermost termination, but that alternative is only
valid when the TRS is of a certain form, such as being non-overlapping [Gra95].
After having chosen between the two kinds of termination, the dependency pairs
are displayed and if innermost termination is the goal, the usable rules are
computed (Fig. 2). Now the user may choose to manipulate the dependency pairs
(as far as the method allows this), by rewriting, narrowing or instantiating them
one or many times. Subsequently, one can compute the dependency graph and
its cycles, which results in a new window for every cycle, in which the same
procedure can be repeated. Eventually one can activate the third layer by
generating the inequalities, which results in several windows, all containing a set
of inequalities. Solving one such set, for example by the lexicographic path
ordering, is equivalent to proving that for the originating cycle no infinite chain
can exist. At the moment of writing, the tool supports the lexicographic path
ordering in two variations and can use the POLO system [G95] for polynomial
interpretations. Implementations of other standard orderings can be accessed via
the 'export' function of the tool (writing the inequalities in a file). After
exporting and solving the inequalities remotely, the user may claim the inequalities to
be solved.

Fig. 1. Loading a TRS and choosing the kind of termination to be proved.

Fig. 2. Dealing with one cycle in the graph, its usable rules and the chosen
rule(s) for the AFS.

In order to use a weakly monotonic variation of the ordering solving the 
inequalities, before generating them one may use an argument filtering TRS 
(AFS) to eliminate certain arguments from function symbols. 

When all the cycles have been shown to correspond to only finite chains of
dependency pairs, the proof is finished and can be saved in a file. As an alternative
to the manual approach, one can start a strategy which tries to perform the
proof steps automatically.

3 Strategies

The tool allows the implementation of arbitrary strategies and supports at the 
moment one fully automatic strategy. The strategy can be seen as a standard 
way of pressing the buttons of the tool. As such, the strategy can be invoked 
within any cycle, starting to apply the strategy to the dependency pairs in that 
cycle. By evaluating the strategy, basically the following happens: 

— It is checked whether the inequalities resulting from the dependency pairs 
and rules can be solved by any of the implemented orderings (without re- 
garding additional AFS rules). If several sets of inequalities are generated, 
they are examined in parallel. 

— All pairs that can be rewritten, are rewritten a fixed number of times. All 
pairs that can be narrowed are narrowed (but only one step) and if a possible 
instantiation of a pair makes the graph change, such an instantiation is 
performed. Hereafter the connected components of the cycle are re-computed 
and it is checked whether a cycle/component remains at this point. 

— If the re-computation transforms one component into several new ones, the 
strategy is repeated for every component (this is performed in parallel). 

— If the re-computation results in only one connected component, then all AFSs 
are generated and all sets of inequalities w.r.t. these AFSs are examined by 
the implemented orderings until one of the sets is satisfiable. 

— If none of the sets is satisfiable, the cycles of the component are computed 
and the strategy is recursively applied to all these cycles. 

The aim of this strategy is to postpone the most costly computations as long
as possible.

The complexity of the application is determined by the exponential growth of 
sets of inequalities when we consider all argument filtering TRSs and the choice 
of one strict inequality in every set of generated inequalities. For every cycle in 
the graph we have to check whether a weakly monotonic ordering exists that 
satisfies a certain set of inequalities. Such a set is constructed by demanding 
left-hand sides of both dependency pairs and (usable) rules to be greater or 
equal to the corresponding right-hand sides. Only for one of the dependency 
pairs the inequality needs to be strict. Therefore, as many sets as dependency 
pairs are generated, where in each set a different dependency pair is chosen to 
correspond to a strict inequality. For every set, all possible AFSs are generated, 
the inequalities are normalized with respect to one AFS and, for example, the 
lexicographic path ordering is applied to the normalized inequalities. 
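
The generation of one set of inequalities per dependency pair can be sketched as follows. This is only an illustration of the counting argument above, not the tool's code: the function name `constraint_sets`, the string representation of terms, and the sample pairs are all assumptions made for the example.

```python
from typing import List, Tuple

Pair = Tuple[str, str]             # (left-hand side, right-hand side), kept as plain strings
Constraint = Tuple[str, str, str]  # (lhs, relation, rhs) with relation ">" or ">="


def constraint_sets(cycle_pairs: List[Pair], usable_rules: List[Pair]) -> List[List[Constraint]]:
    """Return one constraint set per dependency pair of the cycle.

    In the i-th set the i-th dependency pair is strict; every other pair
    and every usable rule only has to be weakly decreasing.
    """
    sets = []
    for strict_index in range(len(cycle_pairs)):
        constraints = [(l, ">=", r) for (l, r) in usable_rules]
        for i, (l, r) in enumerate(cycle_pairs):
            relation = ">" if i == strict_index else ">="
            constraints.append((l, relation, r))
        sets.append(constraints)
    return sets


# Hypothetical cycle with two dependency pairs and one usable rule.
example_sets = constraint_sets(
    [("F(s(x))", "F(x)"), ("G(s(x))", "G(x)")],
    [("f(s(x))", "f(x)")],
)
```

Each of these sets would then still have to be combined with every candidate AFS, which is where the exponential growth mentioned above comes from.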

In order to implement an efficient strategy, instead of creating all the cycles, 
the tool is more conservative and restricts itself to computing the connected 
components of the graph; such a component can consist of many cycles. Generating 
inequalities from a component is, pragmatically, performed by making all 
inequalities that stem from dependency pairs strict. Only when the connected 
component cannot be divided after operations on the dependency pairs, and the 
inequalities resulting from applying all possible AFSs cannot be solved, is it 
decided to compute the cycles within the component. 

The strategy is not guaranteed to terminate, but a time-out mechanism is 
provided and in addition the user can manually stop the evaluation of the 
strategy by pressing a button provided by the interface. 
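
The restriction to connected components described above can be illustrated with a small sketch. It assumes that "connected component" means a strongly connected component of the dependency graph and uses Tarjan's algorithm; the function name and the graph encoding (a dictionary from a pair's name to its successors) are hypothetical and not taken from the tool.

```python
from typing import Dict, List


def cyclic_components(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Tarjan's algorithm; keeps only components that contain at least one cycle."""
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    result: List[List[str]] = []
    counter = [0]

    def visit(v: str) -> None:
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            # A single node only counts if it has an edge to itself.
            if len(component) > 1 or v in graph.get(v, []):
                result.append(sorted(component))

    for v in graph:
        if v not in index:
            visit(v)
    return result


# Hypothetical dependency graph: p1 and p2 lie on common cycles, p3 on none.
components = cyclic_components({"p1": ["p1", "p2"], "p2": ["p1", "p3"], "p3": []})
```

Here the single component {p1, p2} stands for two cycles (the self-loop on p1 and the loop through p2), which shows why treating components first is cheaper than enumerating cycles.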

4 Conclusion 

This implementation of the dependency pair method is a useful tool for those 
that try to use the method on examples that require many manipulations of the 
dependency pairs. In manual ‘mode’ one has the advantage of an accurate and 
quick experimentation technique. Finding the proofs automatically works fine for 
smaller examples and parts of other examples, but more research is necessary 
for finding a better way to check satisfiability by weakly monotonic orderings. 

For a statement about performance, a measure such as the number of lines in a 
TRS or the number of function symbols times their arity is insufficient, since it 
does not relate to the number of dependency pairs, cycles, necessary narrowing 
steps and the like. In the absence of a satisfactory measure, only examples can 
demonstrate the usefulness of the application. Techniques that search for 
simplification orderings compatible with the TRS incur hardly any overhead when 
they are used in the setting of the dependency pair method, whereas they become 
successfully applicable to many more TRSs. Compared to the prototype in CiME 2.0 
[Cime], the application described in this paper contains the latest refinements 
of the dependency pair approach and is more user-friendly. 

The tool is available at http://www.ericsson.se/cslab/~thomas/deppairs. 
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1 Introduction 



ELAN is a powerful language and environment for specifying and prototyping 
deduction systems in a language based on rewrite rules controlled by strategies. 
It offers a natural and simple logical framework for the combination of the com- 
putation and deduction paradigms. It supports the design of theorem provers, 
logic programming languages, constraint solvers and decision procedures. 

ELAN takes from functional programming the concept of abstract data types 
and the function evaluation principle based on rewriting. In ELAN, a program 
is a set of labelled conditional rewrite rules $\ell : l \rightarrow r$ if $c$. Informally, rewriting 
a ground term $t$ consists of selecting a rule whose left-hand side (also called 
pattern) matches the current term ($t$), or a subterm ($t_{|\omega}$), computing a 
substitution $\sigma$ that gives the instantiation of rule variables ($l\sigma = t_{|\omega}$), and 
applying it to the right-hand side to build the reduced term (when instantiated 
conditions are satisfied). In general the normalisation of a term may not 
terminate, or terminate with different results corresponding to different selected 
rules, selected sub-terms and non-uniqueness of the substitution $\sigma$ (in Associative 
and Commutative (AC) theories for example). So, evaluation by rewriting is 
essentially non-deterministic 
and backtracking may be needed to generate all results. 
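
The rewriting step described above can be sketched in a few lines. This is a minimal illustration for syntactic (non-AC) matching only, not ELAN's implementation: terms are nested tuples, variables are plain strings, conditions are modelled as Python predicates on the substitution, and all names are assumptions made for the example.

```python
def match(pattern, term, subst=None):
    """Return a substitution s with pattern instantiated by s equal to term, or None."""
    subst = dict(subst or {})
    if isinstance(pattern, str):                      # a rule variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        subst[pattern] = term
        return subst
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def substitute(term, subst):
    """Apply a substitution to the right-hand side of a rule."""
    if isinstance(term, str):
        return subst[term]
    return (term[0],) + tuple(substitute(a, subst) for a in term[1:])


def one_step(term, rules):
    """Yield (label, result) for every rule and every position of a ground term."""
    for label, lhs, rhs, cond in rules:
        s = match(lhs, term)
        if s is not None and cond(s):
            yield label, substitute(rhs, s)
    for i in range(1, len(term)):                     # rewrite inside a subterm
        for label, sub in one_step(term[i], rules):
            yield label, term[:i] + (sub,) + term[i + 1:]


# Two hypothetical labelled rules: r1 : f(X) -> a and r2 : f(X) -> X.
rules = [
    ("r1", ("f", "X"), ("a",), lambda s: True),
    ("r2", ("f", "X"), "X", lambda s: True),
]
results = {r for _, r in one_step(("f", ("f", ("b",))), rules)}
```

Rewriting f(f(b)) in one step already gives three distinct results (a, f(b) and f(a)), depending on the selected rule and subterm, which is the non-determinism that backtracking has to explore.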

One of the main originalities of the ELAN language is to provide AC opera- 
tors allowing a simpler and more concise specification, and a strategy language 
allowing the programmer to specify the control on rule applications. This is in 
contrast to many existing rewriting-based languages where the term reduction 
strategy is hard-wired and not accessible to the designer of an application. 

The strategy language offers primitives for sequential composition, iteration, 
deterministic and non-deterministic choices of elementary strategies that are 
labelled rules. From these primitives, more complex strategies can be expressed. 
In addition, the user can introduce new strategy operators and define them 
by rewrite rules. Evaluation of strategy application is itself based on rewriting. 
Moreover, it should be emphasised that ELAN has logical foundations based on 
rewriting logic [4] and detailed in [1]. So the simple and well-known paradigm 
of rewriting provides both the evaluation mechanism of the language and the 
logical framework in which deduction systems can be expressed and combined. 
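
The kinds of primitives listed above (sequential composition, iteration, deterministic and non-deterministic choice) can be modelled by treating a strategy as a function from a term to the list of its results, with the empty list meaning failure. The combinator names below are hypothetical and are not ELAN syntax; the two elementary "rules" work on integers purely to keep the sketch short.

```python
from typing import Callable, List

Strategy = Callable[[int], List[int]]


def sequence(s1: Strategy, s2: Strategy) -> Strategy:
    """Sequential composition: apply s2 to every result of s1."""
    return lambda t: [u for r in s1(t) for u in s2(r)]


def choice_all(*strategies: Strategy) -> Strategy:
    """Non-deterministic choice: collect the results of all strategies."""
    return lambda t: [r for s in strategies for r in s(t)]


def choice_first(*strategies: Strategy) -> Strategy:
    """Deterministic choice: keep one result of the first strategy that succeeds."""
    def run(t: int) -> List[int]:
        for s in strategies:
            found = s(t)
            if found:
                return found[:1]
        return []
    return run


def repeat(s: Strategy) -> Strategy:
    """Iteration: apply s as long as possible (may loop if s never fails)."""
    def run(t: int) -> List[int]:
        found = s(t)
        if not found:
            return [t]
        return [u for r in found for u in run(r)]
    return run


# Elementary strategies standing in for labelled rules (assumed for illustration).
def halve(n: int) -> List[int]:
    return [n // 2] if n > 0 and n % 2 == 0 else []


def dec(n: int) -> List[int]:
    return [n - 1] if n > 0 else []
```

For instance, `choice_first(halve, dec)(3)` yields `[2]` because `halve` fails on an odd number, while `repeat(choice_all(halve, dec))(4)` explores every path and returns one copy of 0 per path.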

The full ELAN system (interpreter, compiler, standard library, examples, 
applications and documentation) is available through the ELAN web page¹. 



2 Main Design 

The ELAN system is built around the ELAN front-end and the ELAN compiler. 
The front-end accepts parametrized modules with user defined mix-fix syntax. 
The mix-fix parser consists of a lex/yacc based parser for the fixed syntax and a 
context free parser based on the Earley algorithm. The front-end also contains 
a powerful preprocessor able to automatically generate signatures, rules and 
strategies. The corresponding intermediate representation is then transformed 
and optimised (common subexpressions are factorised, complex strategy con- 
structions are translated into rewrite rules and basic strategy operators and 
then specialised by partial evaluation, etc.). This “normalised” representation is 
called Reduce Elan Format (Ref). 

In the design of the new ELAN compiler, we have attempted to make it 
as system independent as possible by modularizing its main components: the 
Reduce Elan Machine (REM), written in Java, reads a Ref program and gen- 
erates C modules; the non-deterministic library deals with choice points and 
backtracking; the runtime library defines term representation, memory manage- 
ment, internal data structure (bitmask, bipartite graph, Diophantine equation, 
etc.), and built-in operators (term comparison, string, identifier, boolean and 
integer public interfaces, etc.). 



[Figure: ELAN System] 




The key point is that the REM Compiler is completely independent of the ELAN 
syntax and is reusable with different systems and formalisms. Basically, a Ref 
program is a text based easily parsable prefix format which contains the mix-fix 
signature definition and some associated properties (built-in, AC, etc.), a list 
of conditional rewrite rules and a list of strategies. The only real dependence 
between those components is that an operator defined as built-in in the Ref 
program should be implemented in a module of the runtime library. To illus- 
trate this aspect, we successfully used the REM Compiler to make executable 
an ASF+SDF [3] specification. The only thing to do was to translate an ASF 
program to Ref (thanks to the μASF format) and extend the runtime library 
in order to integrate the cons and the built-in list operations used by the 
list-matching algorithm (written in ASF itself). This experiment makes the REM 
Compiler a good candidate to make executable a sub-language of Casl². 

¹ http://www.loria.fr/ELAN 

3 Implementation and Optimisations 

Each component of the ELAN Compiler is highly modular and has been designed 
to achieve efficiency through both high-level optimisations such as running a de- 
terministic analysis algorithm before compiling rules and strategy expressions, 
and also low-level optimisations such as representing built-in data-types by 
native C integers, for example. For lack of space, we mention here only a few of 
the key optimisations. 

Many-to-One Pattern Matching. In the syntactic case as in the AC case, the ma- 
jor source of efficiency is the use of compiled deterministic discrimination trees. 
This makes the pre-selection of all applicable rules linear in the size of 
the term to reduce. This pre-selection is useful when applying non-deterministic 
strategies and when building bipartite graphs needed in the AC case. 

Compact Bipartite Graph. The second source of efficiency of the AC matching 
algorithm is the definition of restricted classes of patterns for which a refined data 
structure of compact bipartite graph is used. This approach allows encoding, 
in a single data structure, all matching problems relative to a set of rewrite 
rules and improves the rule selection process (by minimising the bipartite graph 
construction cost, the number of matching attempts and the memory allocation). 

Deterministic Analysis. The third optimisation that is crucial to the compilation 
scheme is the definition of an analysis algorithm which statically detects deter- 
ministic computations. With this approach, the search space size, the memory 
usage, the number of necessary choice points, and the time spent in backtrack- 
ing and memory management can be considerably reduced. We can also benefit 
from the deterministic analysis to improve the efficiency of AC-matching and to 
detect some non-terminating strategies. In practice, this analysis often reduces 
the number of simultaneous active choice points (and the needed memory) to 
a constant and increases the efficiency by a factor of 3 to 30 depending on the 
input specification (for instance, 3 for a Knuth-Bendix completion procedure, 6 
for the N-queens problem, and 30 for the Fibonacci function). The interested 
reader is invited to read [2] for more details. 

Data Structures and Memory Management. The runtime library relies on a 
Mark & Sweep garbage collector and an efficient allocation implementation using 
free lists. Practical studies have shown that recycling data structures (such as 
bipartite graphs) and integrating a generational copy collector can considerably 
reduce the time spent in memory management. 

² Developed by the ESPRIT Working Group CoFI (Common Framework Initiative for 
Algebraic Specification and Development). 

Independently, all data structures have been carefully designed: terms can 
be shared, AC-terms are maintained in Ordered Canonical Form (flattened, or- 
dered and identical subterms are merged using multiplicities), and built-ins are 
implemented by native types and are not “wrapped” by artificial constructors. 
Internal operations are also optimised: destructive updates are performed when 
possible (unshared term not involved in non-deterministic computations) and 
bitmasks (intensively used by matching algorithms) are represented by 32-bit 
or 64-bit integers when possible, etc. 
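
The Ordered Canonical Form mentioned above (flattening, ordering, and merging identical subterms with multiplicities) can be illustrated as follows. This is a sketch under assumptions, not the runtime library's data structure: constants are strings, applications are tuples, and a canonical AC term is encoded as the operator followed by a sorted tuple of (argument, multiplicity) pairs.

```python
from collections import Counter


def flatten(op, term):
    """Collect the arguments found under nested applications of the AC operator op."""
    if isinstance(term, tuple) and term[0] == op:
        args = []
        for a in term[1:]:
            args.extend(flatten(op, a))
        return args
    return [canonical(op, term)]


def canonical(op, term):
    """Return the canonical form of term with respect to the AC operator op."""
    if not isinstance(term, tuple):
        return term                                   # constant
    if term[0] != op:
        return (term[0],) + tuple(canonical(op, a) for a in term[1:])
    counts = Counter(flatten(op, term))               # merge identical arguments
    # repr gives an arbitrary but fixed total order on the arguments.
    return (op, tuple(sorted(counts.items(), key=repr)))


left = canonical("+", ("+", "a", ("+", "b", "a")))    # a + (b + a)
right = canonical("+", ("+", ("+", "a", "b"), "a"))   # (a + b) + a
```

Both terms are mapped to the same representation, with `a` stored once with multiplicity 2, so equality modulo AC reduces to a structural comparison.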

All these optimisations are essential to get an efficient implementation and 
are fully detailed in [5, 2]. Naturally, they can be re-used and applied to any 
similar language implementation. 

The ELAN Compiler is available as part of the current ELAN 3.4 distribution 
and includes a friendly user interface. The generated programs and C-code are 
completely independent from the environment and can be edited, modified and 
integrated as components in any complex system. 



4 Applications and Experimental Results 

ELAN is an attractive framework for building advanced applications and formal 
tools. Among those, let us mention for instance the design of rules and strategies 
for constraint satisfaction problems, theorem proving tools in first-order logic 
with equality, the combination of unification algorithms and of decision proce- 
dures in various equational theories and the design of a tree automata library 
used to prove non-trivial protocol properties. 

By compiling the most significant ELAN programs (which involve many 
non-deterministic computations and a lot of backtracking), we note that the 
resulting executable specifications simulate the application of 1 to 2 million 
(complex conditional) rewrite rules per second on standard Intel or Alpha 
hardware (from 30,000 to 100,000 pure AC rewrite steps per second, up to 
15 million for very simple examples such as Fibonacci numbers). Rather than 
claiming that the ELAN Compiler is equivalent to or (often) faster than most 
comparable implementations (such as ASF+SDF, Brute, CafeOBJ, CiME, Epic, Maude, 
OBJ, RRL, Smaran, Tram, etc.), we regard as the main result of this work that 
highly optimised (semi-)compilation techniques (ASF+SDF, ELAN and Maude) may 
promote rewrite rule based languages to the level of the best functional or 
logical language implementations. In order to support our intuition, we compare 
on two standard benchmarks the performance of the ELAN system³ to the Objective 
Caml functional programming system and the GNU Prolog logic programming system. 
We also compare ELAN with RRL, OBJ and Brute on three pure AC benchmarks 
(see [5] for more details). The experimental results are given in the table below: 

³ On a Sun Enterprise running Solaris 5.6. 

(time in seconds) 

                OCaml   GNU Prolog   Elan   RRL     OBJ     Brute 
Fibonacci(35)   1.80    -            2.97   -       -       - 
Nqueens(12)     19.0    57.0         73.0   -       -       - 
Prop            -       -            0.43   > 24h   1164    1.78 
Bool3           -       -            0.18   > 4h    > 24h   2.25 
Sum100          -       -            1.32   -       > 24h   6.25 
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1 Introduction 

In the last decade, the automatic termination analysis of logic programs has been 
receiving increasing attention. Among other methods, techniques have been pro- 
posed that transform a well-moded logic program into a term rewriting system 
(TRS) so that termination of the TRS implies termination of the logic program 
under Prolog’s selection rule. In [Ohl99] it has been shown that the two-stage 
transformation obtained by combining the transformations of [GW93] into de- 
terministic conditional TRSs (CTRSs) with a further transformation into TRSs 
[CR93] yields the transformation proposed in [AZ96], and that these three 
transformations are equally powerful. In most cases simplification orderings are 
not sufficient to prove termination of the TRSs obtained by the two-stage 
transformation. However, if one uses the dependency pair method [AG00] in 
combination with polynomial interpretations instead, then most of the examples 
described in the literature can automatically be proven terminating. Based on 
these observations, we have implemented a tool for proving termination of logic 
programs automatically. This tool consists of a front-end which implements the 
two-stage transformation and a back-end, the CiME system [CiM], for proving termination 
of the generated TRS. Experiments show that our tool can compete with other 
tools [DSV99] based on sophisticated norm-based approaches. 

2 The Front-End 



TALP takes a Prolog program and a query as input and proceeds in four steps: 

1. The Prolog program is translated into a logic program $\mathcal{P}$. In this process, 
clauses with if-then structures, disjunctions, or negated atoms are translated 
into new clauses. For instance, the clause $A \leftarrow B, \mathit{not}\ C, D$ is replaced 
with the clauses $A \leftarrow B, C, \mathit{fail}$ and $A \leftarrow B, D$. Cuts are ignored. 



L. Bachmair (Ed.): RTA 2000, LNCS 1833, pp. 270-273, 2000. 
© Springer- Verlag Berlin Heidelberg 2000 




TALP: A Tool for the Termination Analysis of Logic Programs 271 



2. The query determines which of the arguments in its predicates are used 
as input and output, respectively. According to this information, the tool 
tries to generate a moding for the logic program such that the program is 
well-moded. If this step is successfully completed, the logic program will be 
transformed into a TRS as follows. 

3. Every atom $A = p(t_1, \ldots, t_n)$ with input positions $i_1, \ldots, i_k$ and output 
positions $i_{k+1}, \ldots, i_n$ is associated with a rewrite rule 

$$\rho(A) = p_{in}(t_{i_1}, \ldots, t_{i_k}) \rightarrow p_{out}(t_{i_{k+1}}, \ldots, t_{i_n})$$ 

and every program clause $C = A \leftarrow B_1, \ldots, B_m$ is transformed into a 
conditional rewrite rule $\rho(C) = \rho(A) \Leftarrow \rho(B_1), \ldots, \rho(B_m)$. The CTRS 
$\mathcal{R}_{\mathcal{P}} = \{ \rho(C) \mid C \in \mathcal{P} \}$ obtained in this way is deterministic because the logic 
program $\mathcal{P}$ is well-moded. 

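4. Every rule $l \rightarrow r \Leftarrow c \in \mathcal{R}_{\mathcal{P}}$ with $n$ conditions in $c$ is transformed into $n+1$ 
unconditional rewrite rules (cf. Sect. 4): 

$$U(l \rightarrow r \Leftarrow c) = \begin{cases} \{l \rightarrow r\} & \text{if } c \text{ is empty} \\ \{l \rightarrow u(s, \vec{x})\} \cup U(u(t, \vec{x}) \rightarrow r \Leftarrow c') & \text{if } c = s \rightarrow t, c' \end{cases}$$ 

where $u$ is a fresh function symbol and 
$\vec{x} = \mathrm{Var}(l) \cap (\mathrm{Var}(t) \cup \mathrm{Var}(c') \cup \mathrm{Var}(r))$. 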
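
Step 4 above can be sketched directly from the definition of U. The encoding is an assumption made for illustration and is not TALP's code: variables are strings, applications are tuples headed by the function symbol, a conditional rule is a triple of left-hand side, right-hand side and a list of conditions (s, t) read as s → t, and fresh symbols are drawn from a generator u1, u2, and so on.

```python
from itertools import count


def variables(term):
    """Set of variables occurring in a term."""
    if isinstance(term, str):
        return {term}
    found = set()
    for arg in term[1:]:
        found |= variables(arg)
    return found


def unravel(lhs, rhs, conditions, fresh):
    """Turn lhs -> rhs <= conditions into len(conditions) + 1 unconditional rules."""
    if not conditions:
        return [(lhs, rhs)]
    (s, t), rest = conditions[0], conditions[1:]
    later = variables(t) | variables(rhs)             # Var(t), Var(r)
    for s2, t2 in rest:                               # Var(c')
        later |= variables(s2) | variables(t2)
    xs = tuple(sorted(variables(lhs) & later))        # variables still needed later
    u = next(fresh)
    return [(lhs, (u, s) + xs)] + unravel((u, t) + xs, rhs, rest, fresh)


# The three conditional rules of the flat example from Sect. 4.
fresh = (f"u{i}" for i in count(1))
ctrs = [
    (("flat_in", ("niltree",)), ("flat_out", ("nil",)), []),
    (("flat_in", ("tree", "x", ("niltree",), "t")), ("flat_out", ("cons", "x", "l")),
     [(("flat_in", "t"), ("flat_out", "l"))]),
    (("flat_in", ("tree", "x", ("tree", "y", "t1", "t2"), "t3")), ("flat_out", "l"),
     [(("flat_in", ("tree", "y", "t1", ("tree", "x", "t2", "t3"))), ("flat_out", "l"))]),
]
trs = [rule for (l, r, c) in ctrs for rule in unravel(l, r, c, fresh)]
```

On this input the sketch produces five unconditional rules; only `x` is passed along by `u1`, because it is the only variable of the left-hand side that is used again after the condition.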



3 The Back-End CiME 

The back-end tries to prove termination of the rewrite systems generated by 
the front-end. This back-end is a pre-release of version 2 of the rewrite tool 
CiME [CM96] which is available as an alpha version [CiM]. TALP uses one 
specific method available in CiME for proving termination: the dependency pair 
method [AG00] in combination with polynomial interpretations [Gie95]. To be 
precise, CiME performs the following steps: 

1. It computes the estimated dependency graph of the rewrite system. 

2. From the cycles in that graph, it computes a set of constraints of the form 
$t_1 > t_2$ or $t_1 \geq t_2$, that have to be satisfied by a weakly monotonic reduction 
ordering. 

The next goal is to find such an ordering, which is done as follows [GMT99]: 

3. With each symbol $f$ in the signature, say of arity $n$, it associates a parametric 
polynomial interpretation of the simple linear form 
$P_f(x_1, \ldots, x_n) = a_1 x_1 + \cdots + a_n x_n + c$. 

4. Every constraint is translated into constraints on polynomials, and then into 
non-linear Diophantine constraints over the $a_i$'s and $c$'s, by means of some 
(incomplete) positiveness criteria [HJ98]. 

5. The Diophantine constraints are solved for variables in the interval $[0; R]$, 
where $R$ is a bound for coefficients given by the user, by using finite domain 
constraint solving techniques [BG93]. 
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4 Example 

If TALP gets the following Prolog program with query flat(in, out) as input 

flat(niltree, nil). 
flat(tree(X, niltree, T), cons(X, L)) :- flat(T, L). 
flat(tree(X, tree(Y, T1, T2), T3), L) :- 
    flat(tree(Y, T1, tree(X, T2, T3)), L). 

then the first transformation yields the CTRS 

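$flat_{in}(niltree) \rightarrow flat_{out}(nil)$ 

$flat_{in}(tree(x, niltree, t)) \rightarrow flat_{out}(cons(x, l)) \Leftarrow flat_{in}(t) \rightarrow flat_{out}(l)$ 

$flat_{in}(tree(x, tree(y, t_1, t_2), t_3)) \rightarrow flat_{out}(l) \Leftarrow flat_{in}(tree(y, t_1, tree(x, t_2, t_3))) \rightarrow flat_{out}(l)$ 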

and the second transformation yields the unconditional TRS 

$flat_{in}(niltree) \rightarrow flat_{out}(nil)$ 

$flat_{in}(tree(x, niltree, t)) \rightarrow u_1(flat_{in}(t), x)$ 

$u_1(flat_{out}(l), x) \rightarrow flat_{out}(cons(x, l))$ 

$flat_{in}(tree(x, tree(y, t_1, t_2), t_3)) \rightarrow u_2(flat_{in}(tree(y, t_1, tree(x, t_2, t_3))))$ 

$u_2(flat_{out}(l)) \rightarrow flat_{out}(l)$ 

Subsequently CiME is asked to find a linear polynomial interpretation with 
coefficients in the interval [0; 2]. It generates the following interpretation 

$[nil] = 0$, $[flat_{out}](x_0) = 0$, $[u_1](x_0, x_1) = 0$, 

$[niltree] = 0$, $[u_2](x_0) = 0$, $[tree](x_0, x_1, x_2) = x_2 + 2x_1 + 1$, 

$[flat_{in}](x_0) = 0$, $[cons](x_0, x_1) = 0$, $[FLAT_{in}](x_0) = x_0$ 

and the induced polynomial ordering satisfies all constraints obtained from the 
cycles in the estimated dependency graph. 
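
The claim above can be checked mechanically for the two dependency pairs between FLAT_in terms, which are the pairs that can lie on a cycle. The sketch below is an illustration, not CiME's procedure. It reads the constant of the interpretation of `tree` as 1, which is an assumption about the printed formula, and it uses the simple criterion that a linear polynomial difference is non-negative over the naturals when all its coefficients are non-negative.

```python
CONST = "#const"  # key for the constant part of a linear polynomial

# symbol -> (constant, coefficient of each argument); values as read from the text.
INTERP = {
    "nil": (0, []), "niltree": (0, []),
    "flat_in": (0, [0]), "flat_out": (0, [0]),
    "u1": (0, [0, 0]), "u2": (0, [0]), "cons": (0, [0, 0]),
    "tree": (1, [0, 2, 1]),        # x2 + 2*x1 + 1 (constant 1 is assumed)
    "FLAT_in": (0, [1]),           # x0
}


def poly(term):
    """Interpret a term as a linear polynomial {variable: coefficient, CONST: c}."""
    if isinstance(term, str):
        return {term: 1, CONST: 0}
    const, coeffs = INTERP[term[0]]
    result = {CONST: const}
    for c, arg in zip(coeffs, term[1:]):
        for key, value in poly(arg).items():
            result[key] = result.get(key, 0) + c * value
    return result


def decreases(lhs, rhs, strict):
    """True if [lhs] - [rhs] has no negative coefficient (constant >= 1 if strict)."""
    p, q = poly(lhs), poly(rhs)
    diff = {k: p.get(k, 0) - q.get(k, 0) for k in set(p) | set(q)}
    const = diff.pop(CONST)
    return all(v >= 0 for v in diff.values()) and const >= (1 if strict else 0)


pairs = [
    (("FLAT_in", ("tree", "x", ("niltree",), "t")), ("FLAT_in", "t")),
    (("FLAT_in", ("tree", "x", ("tree", "y", "t1", "t2"), "t3")),
     ("FLAT_in", ("tree", "y", "t1", ("tree", "x", "t2", "t3")))),
]
rules = [
    (("flat_in", ("niltree",)), ("flat_out", ("nil",))),
    (("flat_in", ("tree", "x", ("niltree",), "t")), ("u1", ("flat_in", "t"), "x")),
    (("u1", ("flat_out", "l"), "x"), ("flat_out", ("cons", "x", "l"))),
    (("flat_in", ("tree", "x", ("tree", "y", "t1", "t2"), "t3")),
     ("u2", ("flat_in", ("tree", "y", "t1", ("tree", "x", "t2", "t3"))))),
    (("u2", ("flat_out", "l")), ("flat_out", "l")),
]
pairs_ok = all(decreases(l, r, strict=True) for l, r in pairs)
rules_ok = all(decreases(l, r, strict=False) for l, r in rules)
```

Under these assumptions both pairs decrease strictly: for the second pair the left-hand side evaluates to t3 + 2·t2 + 4·t1 + 3 and the right-hand side to t3 + 2·t2 + 2·t1 + 2. All five rules are interpreted as 0 on both sides, so they are weakly decreasing.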

5 Experimental Results 

As in [DSV99], we have tested the TALP system on well-known examples (the 
benchmarks are collected in [LS97]). A Web interface for TALP is available at 
http://bibiserv.techfak.uni-bielefeld.de/talp/. Overall, our results are 
comparable to those reported in [DSV99] but there are examples for which TALP 
succeeds and other tools do not (e.g. the example in Sect. 4 and Example 2.3.1 in 
[Plü90]) and vice versa. During our experiments we made the following 
observations. For all those examples for which CiME was able to find a termination 
proof, it was also able to generate a suitable linear polynomial interpretation with 
coefficients in the interval [0; 2]. The restriction to linear polynomial interpreta- 
tions seems to be a very good heuristic because whenever searching for a linear 
interpretation fails, then searching for a more general one, like simple-mixed, 
does not succeed either. Table 1 contains the execution times of the back-end 
for finding a termination proof on a Sun SPARCstation 10. 

Program        Query             Time        Ref. 
permutation    perm(i,o)         0.64 sec.   [Plü90] 1.2 
transitivity   p(i,o)            0.19 sec.   [Plü90] 2.3.1 
quicksort      qsort(i,o)        1.73 sec.   [Plü90] 6.1.1 
mult           mult(i,i,o)       0.22 sec.   [Plü90] 7.2.9 
mergesort      mergesort(i,o)    3.03 sec.   [Plü90] 8.2.1a 
flat           flat(i,o)         2.19 sec.   [AZ96] 

Table 1. Benchmarks 
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